Preface


The present book is the result of four years of work that started in Winter 2014/15 and 
was finally concluded in Summer 2018. As such, numerous hours of work went into 
this manuscript by several authors, who were all affiliated with the Pattern Recognition 
Lab of the Friedrich-Alexander-University Erlangen-Nuremberg. I truly appreciate the 
dedication and the hard work of my colleagues that led to this final manuscript and, 
although many already left the lab to take positions in academia and industry, they still 
supported the finalization of this book. 

While major parts of the book were already completed in Winter 2016/17, Springer 
gave us the opportunity to rework the book with new concepts like the geek boxes and 
new figures in order to adapt the book to a broader audience. With the present concepts, 
we hope that the book is suited to early-stage undergraduate students as well as
students who have already completed fundamental math classes and want to deepen their
knowledge of medical imaging. We believe the time invested in improving the manuscript was
well spent and that the final polish gave rise to a textbook with a coherent story line. In
particular, we break with the historical development of the described imaging devices 
and present, e. g., magnetic resonance imaging before computed tomography, although 
they were developed in opposite order. A closer look reveals that this change of order is 
reasonable for didactical purposes: magnetic resonance imaging relies mainly on the 
Fourier transform, while computed tomography requires understanding of the Fourier 
slice theorem discovered by Johann Radon. These observations then also mend the 
apparent historical disorder, as we celebrate Joseph Fourier’s 250th birthday this year
and celebrated the 100th birthday of the Radon transform last year.

We also tried to find many graphical explanations for many of the mathematical 
operations such that the book does not require complete understanding of all mathe- 
matical details. Yet, we also offer details and references to further literature in the 
previously mentioned geek boxes as students in the later semesters also need to be 
familiar with these concepts. In conclusion, we hope that we created a useful textbook 
that will be accessible to many readers. In order to improve this ease of access further, 
we chose to publish the entire manuscript as an open access book under the Creative
Commons Attribution 4.0 International License. Thus, any information in this book can
be shared, copied, adapted, or remixed, even for commercial purposes, as long as the
original source is appropriately referenced and a link to the license is provided. 


June 2018 Andreas Maier 
Chapter 1


Introduction 


Author: Andreas Maier 


The design and manufacturing of modern medical devices requires knowledge
of several disciplines, ranging from physics and materials science to
computer science. Thus, designing a single lecture as an introduction to med- 
ical engineering faces a lot of challenges. Nonetheless, the manuscript Medi- 
cal Imaging Systems — An Introductory Guide aims at being a complete and 
comprehensive introduction to this field for students in the early semesters. 
Medical imaging devices are by now an integral part of modern medicine, 
and have probably already been encountered by all students in their personal 
life. 

This book does not simply summarize the content of the lecture held in 
Erlangen. Instead, it should be understood as additional material to gain a 
better understanding of the theory that is covered in the lecture. To give a 
complete introduction, the lecture notes also cover basic math and physics 
that are required to understand the underlying principles of the imaging 
devices. However, we try to limit this to the very basics. Obviously, this 
is not sufficient to describe everything in the appropriate level of detail. For 
this reason, we introduced geek boxes (cf. Geek Box 1.1) that contain optional 
additional background information. This concept will be used in all chapters 
of the book which are summarised in the following sections. 

Chaps. 2 and 3 of this book give an introduction to signal and image
processing. Chap. 2 introduces the concepts of filtering, convolution, and Fourier
transforms for 1-D signals, all of which are fundamental tools that are used
later across the entire book. We try to explain why these concepts are
required and, as most image processing is digital, also emphasize their discrete
algorithmic counterparts. At the beginning of Chap. 3, the transition to
images is made, and with it the transition from 1-D to 2-D. The chapter
Geek Box 1.1: Geek Boxes


We designed the manuscript to be readable from the first semester 
on. However, we felt that we need to demonstrate that there is much 
more depth that we could go into. In order not to confuse a less 
experienced reader, we omitted most equations and math from the 
main text and relocated them to geek boxes that go into more detail 
and give references to further reading. In addition, we also refresh
concepts that are already known to most readers. Nonetheless, the
important concepts are already mentioned in the main text. This way,
the reader can return to this book at a time when these concepts are
introduced, e. g., in more advanced math courses seemingly unrelated
to medical imaging. As such, this book can be read twice: once omitting
all geek boxes to get an overview of the field and a second time with
a more thorough focus on the mathematical details.


covers the basics of image processing and explains how different image trans- 
formations such as edge detection and blurring are implemented as image 
filters using convolution. 

The following chapters cover examples for imaging devices using stan- 
dard optics. In this book, endoscopy and microscopy are discussed as typical 
modalities of this genre. Endoscopes, see Chap. 4, were among the first med- 
ical imaging devices that were used. Images can be acquired by using long 
and flexible optical fibers that are able to transport visible light through the 
body of a patient. 

Microscopes also use visible light. However, tissue samples or cells have to 
be extracted from the body first, e. g., in a biopsy. Then the microscope’s op- 
tics are used to acquire images at high magnifications that allow the imaging 
of individual cells and even smaller structures. Microscopes and the principles 
of optics are described in Chap. 5. 

Magnetic resonance imaging (MRI), see Chap. 6, uses electromagnetic
waves to excite the hydrogen nuclei of water molecules inside the human body.
Once the excitation is stopped, the nuclei return to their normal state and, in
doing so, emit an electromagnetic radio wave of the same frequency as the one
that was used to excite them. This effect is called nuclear magnetic resonance.
Using this effect, an MRI image is obtained. Fig. 1.1 shows a state-of-the-art MR scanner.

X-ray imaging devices, see Chap. 7, use light of very high energy. However, 
the light is no longer visible for the human eye. The higher energy of the light 
allows for a deeper penetration of the body. Due to different absorption rates 
of X-rays, different body tissues can be distinguished on X-ray images. Tissues 
with high X-ray absorption, e. g., bones, become visible as bright structures 
in X-ray projection images. Today, X-rays are among the most widely spread 
Figure 1.1: MRI is based on nuclear magnetic resonance, which does not
involve ionizing radiation. For this reason MRI is often used in pediatric 
applications. Image courtesy of Siemens Healthineers AG. 



Figure 1.2: X-ray projection images are one of the most wide-spread imaging 
modalities. Image courtesy of Siemens Healthineers AG. 
Figure 1.3: Modern CT systems allow even scanning of the beating heart.
Image courtesy of Siemens Healthineers AG. 


medical imaging technologies. An example for an X-ray imaging device is 
shown in Fig. 1.2. 

Computed tomography (CT) uses X-rays to reconstruct slice and volume 
data as described in Chap. 8. The total absorption along the path of an X-ray 
through the body is actually given by the sum of absorptions by tissues with 
different absorption characteristics along its path. Thus, a measurement of 
the absorptions of X-rays from different directions allows for a reconstruction 
of slice images through the patient’s body. In doing so, much better contrast 
between types of soft tissue is obtained. One is even able to differentiate 
between different tissue types such as brain and brain tumor. Once several 
slices are combined, the entire volume can be reconstructed by stacking the 
slices, which is then referred to as a 3-D image. Fig. 1.3 shows a state-of-the- 
art CT system with a gantry that rotates at 4 Hz. 

X-rays essentially are electromagnetic waves that can be described by their 
amplitude, wavelength, and phase. Phase contrast imaging exploits the effect 
that an X-ray passing through tissue is not only influenced by absorption, 
but that also the phase of the electromagnetic wave is shifted. Chap. 9 shows 
that the phase shift of X-rays can be used to visualize the tissue the X- 
rays have passed. Today, phase contrast imaging is not yet used in clinical 
practice. In fact, due to the high requirements on the type of irradiation, such 
images often require a synchrotron as the source of the radiation. However, 
new developments in research now allow phase contrast images to be generated
using a normal clinical X-ray tube, which renders the application clinically
feasible. At present, technical limitations allow only the scanning of small
specimens such as peanuts, and the mechanical design is still challenging. First
image results indicate that the modality might be of high clinical relevance. 
Fig. 1.4 shows the reconstruction of peanut fibers that are in the range of 
Figure 1.4: An X-ray dark-field setup can be used to reconstruct the ori-
entation of fibers that are smaller than the detector resolution. The image 
on the left shows the reconstructed fiber orientation in different layers of a 
peanut. The image on the right shows a microscopic visualization of the waist 
of the peanut (picture courtesy of ECAP Erlangen). 
Figure 1.5: Modern SPECT/CT systems combine different modalities to
achieve multi-modal imaging. Image courtesy of Siemens Healthineers AG. 


several micrometers. Phase contrast allows for a reconstruction of these fibers, 
although the resolution of the used imaging device based on the absorption 
of X-rays was only 0.1 mm. 

Emission tomography, described in Chap. 10, is used for imaging different 
bodily functions. It uses tracers, which are molecules that are marked with 
radioactive atoms. For example one can introduce a radioactive atom into a 
sugar molecule. When this tracer is consumed by the body it will follow the 
normal metabolism, and its path through the body can be followed. While 
sugar consumption is normal in certain parts of the body such as the muscles 
or the brain, tumors also require a lot of sugar for their growth. Thus, emis- 




Figure 1.6: A typical ultrasound system as it can be found in clinics world- 
wide. Image courtesy of Siemens Healthineers AG. 


sion tomography enables us to see anomalies in sugar consumption within 
the body which is useful to spot tumors or metastases. Fig. 1.5 shows a com- 
bined single-photon emission computed tomography (SPECT) / CT system 
that combines emission tomography with X-ray CT. 

Ultrasound (US) uses high-frequency sound waves to penetrate bodily tis- 
sue. The sound waves are emitted from a probe that is in direct contact with 
the body. The same probe is then also used to measure the reflections of the 
sound waves. Given the time between the emission of the sound wave and the 
measurement of the reflection, one is able to reconstruct how deep the wave 
penetrated the tissue. US is one of the most wide-spread imaging modalities 
as it is rather inexpensive compared to other imaging modalities. Fig. 1.6 
shows a clinical ultrasound system. 

The measurement principle of optical coherence tomography (OCT) is 
quite similar to US. However, light waves are used instead of sound waves. 
Thus, the measurement process needs to be performed at much higher speed 
and penetration depth is much lower than in the case of US. Most applications 
are in eye imaging where 3-D images of the eye are generated. 
System Theory

In the digital age, any medical image needs to be transformed from the continuous
domain to the discrete domain (i.e., 1’s and 0’s) in order to be represented
in a computer. To do so, we have to understand what a continuous and a 
discrete signal is. Both of them are handled by systems which will also 
be introduced in this chapter. Another fundamental concept is the Fourier 
transform as it allows us to represent any time domain signal in frequency 
space. In particular, we will find that both representations — time domain 
and frequency domain — are equivalent and can be converted into each other. 
Having found this important relationship, we can then determine conditions 
which will guarantee that also conversion from continuous to discrete domain 
and vice versa is possible without loss of information. On the way, we will 
introduce several other important concepts that will also find repeated use 
later in this book. 


© The Author(s) 2018 
A. Maier et al. (Eds.): Medical Imaging Systems, LNCS 11111, pp. 13-36, 2018. 
https://doi.org/10.1007/978-3-319-96520-8_2




2.1 Signals and Systems 


2.1.1 Signals 


A signal is a function f(t) that represents information. Often, the independent
variable t is a physical dimension, like time or space. The output f of the
signal is also called the dependent variable. Signals are everywhere in everyday
life, although we are mostly not aware of them. A very prominent example
is the speech signal, where the independent variable is time. The dependent
variable is the electric signal that is created by measuring the changes of air
pressure using a microphone. A description of the speech generation process
enables efficient speech processing, e. g., radio transmission, speech
coding, denoising, speech recognition, and many more. In general, many domains
can be described using system theory, e. g., biology, society, and economy.
For our application, we are mainly interested in medical signals.

Both the dependent and the independent variable can be multidimen- 
sional. Multidimensional independent variables t are very common in images. 
In normal camera images, space is described using two spatial coordinates. 
However, medical images, e. g., CT volume scans, can also have three spatial 
dimensions. It is not necessary that all dimensions have the same meaning. 
Videos have two spatial coordinates and one time coordinate. In the medi- 
cal domain, we can also find higher-dimensional examples like time-resolved 
4-D MR and CT with three spatial dimensions and one time dimension. To 
represent multidimensional values, i.e., vectors, we use bold-face letters t or
multiple scalar values, e.g., t = (x, y, z)^T. The medical field also contains
examples of multidimensional dependent variables f. An example with many
dimensions is electroencephalography (EEG). Electrodes are attached to
the skull and measure electrical brain activity from multiple positions over
time. To represent multidimensional dependent variables, we also use bold-face
letters f.

The signals described above are all in continuous domain, e. g., time and 
space change continuously. Also, the dependent variables vary continuously 
in principle, like light intensity and electrical voltage. However, some sig- 
nals exist naturally in discrete domains w.r.t. the independent variable or 
the dependent variable. An example of a signal that is discrete in both the dependent
and the independent variable is the number of first semester students in medical
engineering. The independent variable time is discrete in this case: the starting
semesters are WS 2009, WS 2010, WS 2011, and so on. Between these points in time,
the number of students is considered constant. The number of students is
restricted to natural numbers. In general, it is also possible that only the
dependent or the independent variable is discrete and the other one continuous.
In addition to signals that are discrete by nature, other signals must be rep- 
resented discretely for processing with a digital computer, which means that 
the independent variable must be discretized before processing with a com- 
Figure 2.1: A system H{·} with the input signal f(t) and the output signal
g(t).


puter. Furthermore, data storage in computers has limited precision, which 
means that the dependent variable must be discrete. Both are a direct conse- 
quence of the finite memory and processing speed of computers. This is the 
reason why discrete system theory is very important in practice. 
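The two discretization steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_and_quantize`, the 5 Hz sine, the sampling rate of 100 samples per second, and the 3-bit resolution are all hypothetical choices that do not come from the text.

```python
import numpy as np

def sample_and_quantize(f, t_start, t_end, sampling_rate, n_bits, value_range):
    """Discretize a continuous signal f(t) in time and in amplitude."""
    # Discretize the independent variable: evaluate f at equidistant time points only.
    n_samples = int(round((t_end - t_start) * sampling_rate))
    t = t_start + np.arange(n_samples) / sampling_rate
    samples = f(t)
    # Discretize the dependent variable: map each value to one of 2**n_bits levels.
    lo, hi = value_range
    n_levels = 2 ** n_bits
    step = (hi - lo) / (n_levels - 1)
    codes = np.clip(np.round((samples - lo) / step), 0, n_levels - 1).astype(int)
    quantized = lo + codes * step
    return t, codes, quantized

# Hypothetical example: a 5 Hz sine, sampled for one second and stored with 3 bits.
t, codes, quantized = sample_and_quantize(
    lambda t: np.sin(2 * np.pi * 5.0 * t),
    t_start=0.0, t_end=1.0, sampling_rate=100.0, n_bits=3, value_range=(-1.0, 1.0),
)
```

The integer array `codes` is what a computer would store. The difference between `quantized` and the original values is the quantization error, which is at most half a quantization step for values inside `value_range`.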

Signals can be further categorized into deterministic and stochastic signals. 
For a deterministic signal, the whole waveform is known and can be written 
down as a function. In contrast, stochastic signals depend randomly on the 
independent variable, e. g., if the signal is corrupted by noise. Therefore, for 
practical applications, the stochastic properties of signals are very important. 
Nevertheless, deterministic signals are important to analyze the behavior of 
systems. A short introduction to stochastic signals and randomness will be
given in Sec. 2.4.3.

This chapter presents basic knowledge on how to represent, analyze,
and process signals. The correct processing of signals requires some math
and theory. A more in-depth introduction to the concepts presented here
can be found in [3]. The application to medical data is treated in [2].


2.1.2 Systems 


Signals are processed in processes or devices, which are abstracted as systems.
This includes not only technical devices, but also natural processes like
the attenuation and reverberation of speech in transmission through air.
Systems have signals as input and as output. Inside the system, the properties
of the signal are changed or signals are related to each other. We describe the
processing of a signal using a system with the operator H{·} that is applied
to the function f. A graphical representation of a system is shown in Fig. 2.1.

An important subtype is the linear shift-invariant system. Linear shift- 
invariant systems are characterized by the two important properties of lin- 
earity and shift-invariance (cf. Geek Box 2.1 and 2.2). 

Another property important for the practical realization of linear shift- 
invariant systems is causality. A causal system does not react to the input 
Geek Box 2.1: Linear Systems


The linearity property of a system means that linear combinations 
of inputs can be represented as the same linear combination of the 
processed inputs 


$$ H\{a f(t)\} = a\, H\{f(t)\} \tag{2.1} $$
$$ H\{f(t) + g(t)\} = H\{f(t)\} + H\{g(t)\}, \tag{2.2} $$


with constant a and arbitrary signals f and g. The linearity property 
greatly simplifies the mathematical and practical treatment, as the 
behavior of the system can be studied on basic signals. The behav- 
ior on more complex signals can be inferred directly if they can be 
represented as a superposition of the basic signals. 


Geek Box 2.2: Shift-Invariant Systems 


Shift-invariance denotes the characteristic of a system that its re- 
sponse is independent of shifts of the independent variable of the 
signal. Mathematically, this is described as 


$$ g_1(t) = H\{f(t)\} \tag{2.3} $$
$$ g_2(t) = H\{f(t - \tau)\} \tag{2.4} $$
$$ g_1(t - \tau) = g_2(t), \tag{2.5} $$

for the shift τ. This means that shifting the signal by τ followed by
processing with the system is identical to processing the signal with
the system followed by a shift by τ.


before the input actually arrives in the system. This is especially important
for signals with time as the independent variable. However, non-causal
systems do not pose a problem when the independent variable is space, e.g.,
image filters that use information from the left and right of a pixel. Geek
Box 2.3 presents examples for the combination of different system properties.
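The properties of linearity and shift-invariance can also be checked numerically on discrete signals. The following is a minimal sketch, not a proof: it tests the defining equations on random inputs. All three example systems and the helper names are hypothetical, and shifts are implemented as circular shifts (`np.roll`) so that no samples are lost at the signal borders.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_linear(system, n=64, trials=5):
    """Test H{a f} = a H{f} and H{f + g} = H{f} + H{g} on random signals."""
    for _ in range(trials):
        f, g = rng.normal(size=n), rng.normal(size=n)
        a = rng.normal()
        if not np.allclose(system(a * f), a * system(f)):
            return False
        if not np.allclose(system(f + g), system(f) + system(g)):
            return False
    return True

def is_shift_invariant(system, n=64, trials=5):
    """Test that shifting before the system equals shifting after the system."""
    for _ in range(trials):
        f = rng.normal(size=n)
        tau = int(rng.integers(1, n))
        if not np.allclose(system(np.roll(f, tau)), np.roll(system(f), tau)):
            return False
    return True

def delay_and_scale(f):
    # g[n] = 2 f[n - 1]: expected to be linear and shift-invariant.
    return 2.0 * np.roll(f, 1)

def ramp_weighting(f):
    # g[n] = n f[n]: linear, but the weight depends on the position.
    return np.arange(f.size) * f

def squaring(f):
    # g[n] = f[n]^2: shift-invariant, but not linear.
    return f ** 2
```

A passed check only indicates that no counterexample was found among the random trials, whereas a failed check is a definite counterexample.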

Linear shift-invariant systems are important in practice and have convenient
properties and a rich theory. For linear shift-invariant systems, the abstract
operator H{·} can be described completely using the impulse response
h(t) (cf. Sec. 2.2.2) or the transfer function H(ξ) (cf. Sec. 2.3.2). The impulse
response is combined with the signal by the operation of convolution. This is
sufficient to describe all linear shift-invariant systems.
Geek Box 2.3: System Examples


Here are some examples of different systems analyzed w. r.t. linearity, 
shift-invariance, and causality. f(t) represents the input and g(t) the 
output signal. 


g(t) = 3f(t + 2): linear, shift-invariant, non-causal

g(t) = 2f(t − 1): linear, shift-invariant, causal
(t) - e 7959: linear, not shift-invariant, causal 


f 
J 


2.2 Convolution and Correlation 


This section describes the combination of signals in linear shift-invariant systems,
i.e., convolution or correlation. Before discussing signal processing in
detail, we will first start by revisiting important mathematical concepts that 
will be needed in the following chapters. 


2.2.1 Complex Numbers 


Complex numbers are an extension of the real numbers. They are defined as
z = a + bi, where a is called the real part of z and b the imaginary part. Both
act as coordinates in a 2-D space. i is the imaginary unit that spans the
second dimension of this space. The special meaning of i is that i² = −1.
This makes complex numbers important for many areas in mathematics, but
also in many applied fields like physics and electrical engineering. To extract
the coordinates of the complex number, we use the following definitions

$$ a = \operatorname{Re}(z) \tag{2.6} $$
$$ b = \operatorname{Im}(z). \tag{2.7} $$

We can directly write z = Re(z) + Im(z) i. Another important definition is
the complex conjugate z*, which is the same number as z except with the
opposite sign for the imaginary part, z* = a − bi.

Real numbers are the subset of the complex numbers for which b = 0, i.e.,
no imaginary part. Geometrically, this means that real numbers are defined
on a one-dimensional axis, whereas the complex numbers are defined on a
2-D plane. The geometric interpretation of complex numbers is also helpful
to see the equivalence of the Cartesian coordinate notation z = a + bi and
the polar coordinate notation z = A(cos φ + i sin φ) of complex numbers. The
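Geek Box 2.4: Complex Numbers and Geometric Interpretation


If a point on the 2-D plane is seen as a position vector, A is the
length of the vector and φ the angle relative to the real axis. The two
notations can be converted to each other using the following formulas:

$$ A = \sqrt{a^2 + b^2} $$
$$ \varphi = \begin{cases}
\arctan\frac{b}{a}, & \text{if } a > 0 \\
\arctan\frac{b}{a} + \pi, & \text{if } a < 0 \text{ and } b \geq 0 \\
\arctan\frac{b}{a} - \pi, & \text{if } a < 0 \text{ and } b < 0 \\
\frac{\pi}{2}, & \text{if } a = 0 \text{ and } b > 0 \\
-\frac{\pi}{2}, & \text{if } a = 0 \text{ and } b < 0 \\
\text{undefined}, & \text{if } a = 0 \text{ and } b = 0
\end{cases} $$
$$ a = A \cos\varphi $$
$$ b = A \sin\varphi $$
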

polar coordinates consist of magnitude A and angle φ (cf. Geek Box 2.4). For
system theory, an important property of complex numbers is Euler's formula

$$ \exp(i\varphi) = e^{i\varphi} = \cos(\varphi) + i \sin(\varphi). \tag{2.8} $$

Using this relation, a complex sum of sine and cosine can be expressed
conveniently using a single exponential function. This leads directly to the
exponential notation of complex numbers z = A e^{iφ}. We will use the complex
numbers and different notations in Sec. 2.3.
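The different notations can be tried out directly with Python's built-in complex numbers and the standard `cmath` module. The example value z = −1 + i is an arbitrary choice for illustration and not taken from the text.

```python
import cmath
import math

z = complex(-1.0, 1.0)           # Cartesian notation: a = -1, b = 1
a, b = z.real, z.imag            # Re(z) and Im(z)
z_conj = z.conjugate()           # complex conjugate a - bi

A = abs(z)                       # magnitude sqrt(a^2 + b^2)
phi = cmath.phase(z)             # angle in (-pi, pi], computed as atan2(b, a)
z_back = cmath.rect(A, phi)      # back to Cartesian: A (cos(phi) + i sin(phi))

# Euler's formula: both sides should agree up to rounding errors.
euler_lhs = cmath.exp(1j * phi)
euler_rhs = complex(math.cos(phi), math.sin(phi))
z_exponential = A * euler_lhs    # exponential notation A e^{i phi}
```

Here, `cmath.phase` takes care of the case distinction that is needed when the angle is computed from the arctangent of b/a.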
Description        Equation

Linearity          g(t) * (a·f(t) + b·h(t)) = a (g * f)(t) + b (g * h)(t)
Shift-invariance   g(t) * f(t − τ) = (g * f)(t − τ)
Commutativity      g(t) * f(t) = f(t) * g(t)
Associativity      g(t) * ((f * h)(t)) = ((f * g)(t)) * h(t)
Distributivity     f(t) * (g(t) + h(t)) = (f * g)(t) + (f * h)(t)


Table 2.1: Some mathematical properties of convolution. a, b are constants. 


2.2.2 Convolution 


As mentioned above, convolution is the operation that is necessary to describe 
the processing of any signal with a linear shift-invariant system. Convolution 
in the continuous case is defined as 


$$ g(t) = (h * f)(t) = \int_{-\infty}^{\infty} h(\tau)\, f(t - \tau)\, \mathrm{d}\tau. \tag{2.9} $$

In order for the convolution to be well-defined, some requirements for the 
functions h and f must be fulfilled. For the infinite integral to exist, h and 
f must decay fast enough towards infinity. This is the case if one of the 
functions has compact support, i.e., it is 0 everywhere except for a limited 
region. As an example, the convolution of a square input function f(t) with a
Gaussian function h(t) is investigated in Geek Box 2.5. Further mathematical

properties of convolution are listed in Table 2.1. 
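On sampled signals, the convolution integral can be approximated by a discrete sum. The following sketch uses NumPy's `np.convolve` for a single rectangular pulse and a Gaussian. The grid, the pulse width, and the standard deviation `sigma` are assumptions made for this illustration only, and the factor `dt` turns the discrete sum into a Riemann-sum approximation of the integral.

```python
import numpy as np

# Symmetric time grid with an odd number of samples, so that t = 0 lies exactly
# in the center and mode="same" keeps the output aligned with t.
t = np.linspace(-5.0, 5.0, 1001)
dt = t[1] - t[0]

# Input signal: rectangular pulse of height 1 for |t| <= 1 (small tolerance for rounding).
f = np.where(np.abs(t) <= 1.0 + 1e-9, 1.0, 0.0)

# Impulse response: Gaussian with (assumed) standard deviation sigma and unit area.
sigma = 0.3
h = np.exp(-t ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

# Discrete approximation of g(t) = integral of h(tau) f(t - tau) dtau.
g = np.convolve(f, h, mode="same") * dt
```

Plotting `f` and `g` over `t` shows the smoothing effect: the plateau of the pulse is preserved, while the sharp edges are replaced by gradual transitions.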

A common basic signal is the Dirac function, which is also called delta
function or impulse function. It is an infinitely short, infinitely high impulse.

$$ \delta(t) = \begin{cases} \infty, & t = 0 \\ 0, & \text{otherwise} \end{cases} \tag{2.10} $$



It is impossible to describe the Dirac function using classical functions. 
It requires the use of generalized functions or distributions, which is out 
of the scope of this introduction. The Dirac function is usually represented 
graphically as an arrow of length 1, see Fig. 2.2. 

Sequences of Dirac pulses are useful to select only certain points of a
function like a sifter (cf. Figure 2.3). The sifting property of the Dirac function
is given by integrating the product of a function and a time-delayed Dirac
function

$$ \int_{-\infty}^{\infty} f(t)\, \delta(t - T)\, \mathrm{d}t = f(T). $$
Geek Box 2.5: Convolution Example


[Figure: input signal f(t), impulse response h(t), and output signal g(t)]


For the definition of the square function, the Heaviside step function 
is useful to shorten the notation 


$$ H(t) = \begin{cases} 0, & \text{if } t < 0 \\ 1, & \text{otherwise.} \end{cases} $$


Then, the square function and the Gaussian are defined as

$$ f(t) = k_1 + k_2 \sum_{n=-\infty}^{\infty} \big( H(t - nT) - H(t - nT - k_3) \big) $$
$$ h(t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{t^2}{2\sigma^2} \right) $$

with the offset k₁, the amplitude k₂, the duty-cycle k₃, and the period
T of the square function and the standard deviation σ of the Gaussian.
The convolution with a Gaussian results in a smoothing of the edges
of the square function.


With the sifting property, the element at t = T can be selected from the
function, which is equivalent to sampling the function at that time point.

The sifting property is useful for the convolution of an arbitrary function with
the Dirac function.


$$ f(t) * \delta(t - T) = \int_{-\infty}^{\infty} f(\tau)\, \delta(t - T - \tau)\, \mathrm{d}\tau = f(t - T) \tag{2.11} $$


Consequently, the Dirac function is the identity element of convolution. 
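The discrete counterpart of this relation is easy to verify. In the sketch below, the Dirac function is replaced by a discrete unit impulse, i.e., an array that is 1 at a single index and 0 elsewhere. The signal values are arbitrary numbers chosen for illustration.

```python
import numpy as np

f = np.array([1.0, 4.0, 2.0, 0.0, 3.0])

# Discrete unit impulse at index 0 and a copy that is delayed by T = 2 samples.
delta = np.zeros(4)
delta[0] = 1.0
shifted_delta = np.zeros(4)
shifted_delta[2] = 1.0

# Convolution with the impulse reproduces f (followed by zero padding) ...
identity_result = np.convolve(f, delta)
# ... and convolution with the delayed impulse delays f by two samples.
shifted_result = np.convolve(f, shifted_delta)
```

With the default `mode="full"`, the outputs have 5 + 4 − 1 = 8 samples, so the delayed copy of `f` is not truncated.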
Figure 2.2: Graphical representation of the Dirac function δ(t). The arrow
symbolizes infinity.


Figure 2.3: Laboratory sifters are used to remove undesired parts from 
discrete signals. Sequences of Dirac pulses can be applied in a similar way. 
Image courtesy of BMK Wikimedia. 


The response of a system to a Dirac function on the input is called the
impulse response of the system, h(t) = H{δ(t)}. Using the superposition
principle, every other signal can be represented as a linear combination of
infinitely many Dirac functions. Therefore, the output of a linear shift-invariant
system for any input signal is computed by convolution of the input signal f(t)
with the impulse response h(t).


$$ g(t) = f(t) * h(t) \tag{2.12} $$


For medical applications, an important example of a linear shift-invariant
system is an imaging system: its behavior is often modeled
as linear and shift-invariant. The impulse response of an imaging
system is called point spread function. It describes how a single point, i.e., a 
Dirac impulse, is spread on the sensor plane by the specific imaging system. 
The point spread function is a description of the behavior of the system. 
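The statement that the impulse response describes the behavior of a linear shift-invariant system can be illustrated with a small 1-D experiment. The system below, a causal moving average over three samples, is a hypothetical stand-in for a blurring imaging system and is treated as a black box: its impulse response is measured first and then used to predict the output for a different input.

```python
import numpy as np

def smoothing_system(f):
    """Hypothetical black-box system: causal moving average over 3 samples."""
    g = np.zeros(len(f), dtype=float)
    for n in range(len(f)):
        for k in range(3):
            if n - k >= 0:
                g[n] += f[n - k] / 3.0
    return g

n_samples = 16

# Step 1: measure the impulse response by feeding a discrete unit impulse.
impulse = np.zeros(n_samples)
impulse[0] = 1.0
h = smoothing_system(impulse)

# Step 2: predict the response to an arbitrary input by convolution with h.
rng = np.random.default_rng(1)
f = rng.normal(size=n_samples)
g_direct = smoothing_system(f)
g_predicted = np.convolve(f, h)[:n_samples]
```

Both outputs agree, although the convolution only uses the measured impulse response and no knowledge about the inner workings of `smoothing_system`. In 2-D, the same idea applies with the point spread function in place of `h`.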
2.2.3 Correlation


Another basic operation to combine a signal and a system is correlation


$$ g(t) = (h \star f)(t) = \int_{-\infty}^{\infty} h^*(\tau)\, f(t + \tau)\, \mathrm{d}\tau, \tag{2.13} $$

where h* is the complex conjugate of h. The main difference to convolution
is that the input signal f is not mirrored before combination with h, i.e.,
f(t + τ) instead of f(t − τ). Correlation is a way to measure the similarity of
two signals.

An application of correlation is the matched filter. The matched filter is 
specifically designed to have a high response for a specific deterministic signal 
or waveform f(t). It is matched to that signal. The matched filter is directly 
computed by correlation with the desired signal. Alternatively, convolution
with an impulse response given by the mirrored complex conjugate of the desired
deterministic signal, h(t) = f*(−t), can be used.

Technical uses for correlation can be found in signal transmission and 
signal detection. For a medical example, the heartbeats of a person can be 
detected in an Electrocardiogram (ECG) using correlation with a template 
QRS complex (QRS complex denotes the combination of three of the graphi- 
cal deflections seen on an ECG). In image processing, a certain deterministic 
signal is searched for across the whole image. In this case, the deterministic 
signal is often called template and the process of searching is called tem- 
plate matching. This can be used for the detection of specific structures and 
tracking of structures over time. Geek Box 2.6 puts the correlation in signal 
processing in relation to the statistical correlation coefficient. 
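Template matching by correlation can be sketched on a synthetic 1-D signal. The template, the two positions at which it is hidden, the noise level, and the detection threshold are all made-up values for illustration; the template is not a real QRS complex.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical template and a noisy signal that contains it at two known positions.
template = np.array([0.0, 1.0, 3.0, -2.0, 1.0, 0.0])
true_positions = [57, 140]
signal = 0.05 * rng.normal(size=200)
for p in true_positions:
    signal[p:p + template.size] += template

# Matched filter: correlate the signal with the template.
response = np.correlate(signal, template, mode="valid")

# Equivalent formulation: convolve with the mirrored template (real-valued case).
response_via_convolution = np.convolve(signal, template[::-1], mode="valid")

# A position counts as a detection if the response exceeds half the template energy.
energy = float(np.sum(template ** 2))
detections = np.flatnonzero(response > 0.5 * energy).tolist()
```

The response peaks where the signal matches the template. For real recordings, the threshold would have to be chosen with more care, and a normalized correlation is often preferable because it is insensitive to the signal amplitude.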


2.3 Fourier Transform 


Up to this point, all operations and mathematical definitions were performed 
in continuous domain. Also, we have not discussed the relation between dis- 
crete and continuous representations which are important to understand the 
concept of sampling. In the following, we will introduce the Fourier transform 
and related concepts which will allow us to deal with exactly such problems. 


2.3.1 Types of Fourier Transforms 


A cosine wave f of time t with amplitude A, frequency ξ, and phase shift φ
can be described by the following three equivalent parametrizations.
Geek Box 2.6: Relation to the Statistical Correlation Coefficient


[Figure: scatter plot of expert score over word recognition rate, one point per patient, with regression line]


In statistics, the so-called Pearson correlation coefficient r [5] is a measure
of agreement between two sets of observations x and y. The coefficient
r is defined in the interval [−1, 1] and if |r| = 1, a perfect linear
relationship between the two variables is present. It is computed in
the following way:

$$ r(x, y) = \frac{1}{N} \sum_{n=1}^{N} \frac{(x_n - \bar{x})(y_n - \bar{y})}{\sigma_x \sigma_y} $$

Here, N is the number of observations, and we use x̄, ȳ, σ_x, and σ_y to denote
the respective mean values and standard deviations. If we assume the standard
deviations to be equal to 1 and the means equal to 0, we arrive at the following
equation:

$$ r(x, y) = \frac{1}{N} \sum_{n=1}^{N} x_n \cdot y_n $$

Up to the factor 1/N, this is identical to the discrete version of correlation
for real inputs for t = 0. Also note that this can be considered simply as a
scaled inner product x^T y.

The figure at the top of this box shows a scatter plot of the two
variables word recognition rate and expert score. Each point (x_n, y_n)
denotes one patient for whom both variables were measured.
The closer the points are to the dotted line, the better their agreement.
Here, the dependency is negative: if one variable is high, the other
is low and vice versa. In this example, r ≈ −0.9. Please refer to [4] for
more details.
Figure 2.4: Approximation of a periodic signal $f(t)$ using a weighted sum of
trigonometric functions. (a) Fourier coefficients, i.e., the weights of the
trigonometric functions approximating the signal (coefficient at 0, the
offset, and coefficients at 1/2 and 3/2). (b) Periodic signal and
approximations using different numbers of Fourier coefficients (frequency 0;
frequencies 0 and 1/2; frequencies 0, 1/2, and 3/2).


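$$
\begin{aligned}
f(t) &= A \cdot \cos(2\pi\xi t + \varphi), & A, \varphi &\in \mathbb{R} \\
     &= a \cdot \cos(2\pi\xi t) + b \cdot \sin(2\pi\xi t), & a, b &\in \mathbb{R} \\
     &= c \cdot e^{2\pi i \xi t} + \bar{c} \cdot e^{-2\pi i \xi t}, & c &\in \mathbb{C}
\end{aligned}
$$


In Geek Box 2.7, we show how the parameters $a$, $b$, and $c$ are related to
$A$ and $\varphi$.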

A Fourier series (cf. Geek Box 2.8) represents a continuous periodic
signal using only discrete frequencies. As such, a Fourier series can
approximate any periodic signal as a superposition of sine and cosine waves.
Fig. 2.4(b) shows a rectangular signal over time. The absolute values of its
Fourier coefficients are depicted in Fig. 2.4(a). As can be seen there, the
magnitudes of the Fourier coefficients decrease as the frequency increases. It
is therefore possible to approximate the signal by setting the coefficients to
0 for all high frequencies. Fig. 2.4(b) includes the approximations for three
different sets of frequencies.
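
The truncation described above can be tried out numerically. The following is a minimal sketch, not the computation behind Fig. 2.4: the period `T = 2`, the pulse shape in `f`, and the helper names `coefficient` and `partial_sum` are assumptions chosen for illustration. The coefficients follow Eq. (2.16) of Geek Box 2.8, evaluated with a simple midpoint sum.

```python
import cmath
import math

T = 2.0  # assumed period of the rectangular signal


def f(t):
    """Rectangular pulse train: 1 for |t| < 0.5 within each period, else 0."""
    t = (t + T / 2) % T - T / 2  # wrap t into [-T/2, T/2)
    return 1.0 if abs(t) < 0.5 else 0.0


def coefficient(k, steps=20000):
    """Approximate the Fourier coefficient c[k] with a midpoint Riemann sum."""
    dt = T / steps
    total = 0j
    for m in range(steps):
        t = -T / 2 + (m + 0.5) * dt
        total += f(t) * cmath.exp(-2j * math.pi * t * k / T) * dt
    return total / T


def partial_sum(t, k_max):
    """Truncated Fourier series that keeps only frequencies with |k| <= k_max."""
    s = sum(coefficient(k, 2000) * cmath.exp(2j * math.pi * t * k / T)
            for k in range(-k_max, k_max + 1))
    return s.real


# Coefficients at the frequencies k/T = 0, 1/2, 1, 3/2.
c = [coefficient(k) for k in range(4)]
```

For this particular pulse, the coefficient magnitudes fall off with frequency, and `partial_sum(t, 3)` gives a smooth approximation that overshoots near the jumps.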

The Fourier series, which works on periodic signals, can be extended to
aperiodic signals by increasing the period length to infinity. The resulting 
transform is called continuous Fourier transform (or simply Fourier trans- 
form, cf. Geek Box 2.9). Fig. 2.5(b) shows the Fourier transform of a rectan- 
gular function, which is identical to the Fourier coefficients at the respective 
frequencies up to scaling (see Fig. 2.5(a)). 

The counterpart to the Fourier series for cases in which the time domain is
discrete and the frequency domain is continuous is called the discrete-time
Fourier transform (cf. Geek Box 2.10). It forms a step towards the
discrete Fourier transform (cf. Geek Box 2.11), which allows us to perform
all previous operations in a digital signal processing system as well. In
discrete space, we can interpret the Fourier transform simply as a
multiplication with a complex matrix $\mathbf{F}$
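Name | Function | Fourier transform
-----|----------|------------------
Rectangular | $\mathrm{rect}(at) = \begin{cases} 0 & \text{if } \lvert at \rvert > \frac{1}{2} \\ \frac{1}{2} & \text{if } \lvert at \rvert = \frac{1}{2} \\ 1 & \text{if } \lvert at \rvert < \frac{1}{2} \end{cases}$ | $\mathcal{F}[\mathrm{rect}(at)](\xi) = \frac{1}{\lvert a \rvert}\,\mathrm{sinc}(\xi / a)$
Triangular | $\mathrm{tri}(t) = \begin{cases} 1 - \lvert t \rvert & \text{if } \lvert t \rvert < 1 \\ 0 & \text{otherwise} \end{cases}$ | $\mathcal{F}[\mathrm{tri}(t)](\xi) = \mathrm{sinc}^2(\xi)$
Gaussian | $\mathrm{gauss}(t) = e^{-a t^2}$ | $\mathcal{F}[\mathrm{gauss}(t)](\xi) = \sqrt{\pi / a}\; e^{-\pi^2 \xi^2 / a}$

Table 2.2: Fourier transforms of popular functions. Here we use the
definition $\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$. Note that a
convolution of two rectangular functions yields a triangular function as
$\mathcal{F}[\mathrm{rect}(t) * \mathrm{rect}(t)] = \mathrm{sinc}^2(\xi)$.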


$$ \mathbf{k} = \mathbf{F}\mathbf{n} \tag{2.14} $$


where the signal $\mathbf{n}$ and the discrete spectrum $\mathbf{k}$ are
vectors of complex values. The inverse operation is then readily found as


$$ \mathbf{n} = \mathbf{F}^{\mathsf{H}}\mathbf{k} \tag{2.15} $$


where $\mathbf{F}^{\mathsf{H}}$ is the Hermitian, i.e., the transposed and
element-wise conjugated, of $\mathbf{F}$. Geek Box 2.12 shows some more details
on how to find these relations, including the scaling of $\mathbf{F}$ that
they require. Fig. 2.5 shows all types of Fourier transforms introduced in
this section in comparison. Tab. 2.2 shows the Fourier transforms of popular
functions.

In computer programs, discrete Fourier transforms are implemented very
efficiently using the fast Fourier transform (FFT). This approach reduces the
number of computations from the order of $N^2$ to the order of $N \log N$,
where $N$ is the length of the signal. In the next section, we will see why
convolution and correlation also benefit from this efficiency.
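
To make the difference between the two operation counts concrete, the sketch below compares a direct evaluation of the DFT sum with a recursive radix-2 FFT. The radix-2 (Cooley-Tukey) scheme is only one common FFT variant and is chosen here as an assumption; the function names `dft` and `fft` and the test signal are illustrative.

```python
import cmath


def dft(signal):
    """Direct DFT: N products for each of the N frequencies, i.e. O(N^2)."""
    n_samples = len(signal)
    return [sum(signal[n] * cmath.exp(-2j * cmath.pi * n * k / n_samples)
                for n in range(n_samples))
            for k in range(n_samples)]


def fft(signal):
    """Recursive radix-2 FFT, O(N log N). The length must be a power of two."""
    n_samples = len(signal)
    if n_samples == 1:
        return list(signal)
    even = fft(signal[0::2])  # DFT of the even-indexed samples
    odd = fft(signal[1::2])   # DFT of the odd-indexed samples
    half = n_samples // 2
    out = [0j] * n_samples
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n_samples) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 3.0]
direct = dft(signal)
fast = fft(signal)
max_diff = max(abs(a - b) for a, b in zip(direct, fast))
```

Both functions return the same spectrum up to rounding; only the number of operations differs.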


2.3.2 Convolution Theorem & Properties 


The convolution of two functions $f$ and $g$ is defined as in Sec. 2.2.2, and
$\cdot$ denotes point-wise multiplication. The convolution theorem states that
a convolution of two signals in time or space is identical to a point-wise
multiplication of their spectra $F$ and $G$ (see Equation 2.24). The opposite
also holds true (see Equation 2.25).


$$ \mathcal{F}\{f * g\} = F \cdot G \tag{2.24} $$
$$ \mathcal{F}\{f \cdot g\} = F * G \tag{2.25} $$
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Geek Box 2.7: Equivalent Cosine Representations 


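Oscillations of the same frequency can be represented in several
equivalent ways. In the following, we make use of the complex numbers
introduced in Sec. 2.2.1 and the correspondence between a sum of
complex exponentials and the real part, $z + \bar{z} = 2\,\mathrm{Re}(z)$, to
convert the different representations into the same expression.


Amplitude and phase shift, where we define $c = \frac{1}{2} A e^{i\varphi}$:

$$ f(t) = A \cdot \cos(2\pi\xi t + \varphi) = \mathrm{Re}\left(A \cdot e^{2\pi i \xi t + i\varphi}\right) = \mathrm{Re}\left(2c \cdot e^{2\pi i \xi t}\right) $$


Sum of cosine and sine functions, where we define $c = \frac{1}{2}(a - ib)$:

$$
\begin{aligned}
f(t) &= a \cdot \cos(2\pi\xi t) + b \cdot \sin(2\pi\xi t) \\
     &= a \cdot \cos(2\pi\xi t) + b \cdot \cos(2\pi\xi t - \pi/2) \\
     &= \mathrm{Re}\left(a \cdot e^{2\pi i \xi t}\right) + \mathrm{Re}\left(b \cdot e^{2\pi i \xi t - i\pi/2}\right) \\
     &= \mathrm{Re}\left(a \cdot e^{2\pi i \xi t}\right) + \mathrm{Re}\left(b \cdot e^{2\pi i \xi t} e^{-i\pi/2}\right) \\
     &= \mathrm{Re}\left(a \cdot e^{2\pi i \xi t} - ib \cdot e^{2\pi i \xi t}\right) \\
     &= \mathrm{Re}\left((a - ib) \cdot e^{2\pi i \xi t}\right) = \mathrm{Re}\left(2c \cdot e^{2\pi i \xi t}\right)
\end{aligned}
$$


Sum of complex exponentials:

$$
\begin{aligned}
f(t) &= c \cdot e^{2\pi i \xi t} + \bar{c} \cdot e^{-2\pi i \xi t} \\
     &= \mathrm{Re}\left(c \cdot e^{2\pi i \xi t}\right) + i\,\mathrm{Im}\left(c \cdot e^{2\pi i \xi t}\right) + \mathrm{Re}\left(c \cdot e^{2\pi i \xi t}\right) - i\,\mathrm{Im}\left(c \cdot e^{2\pi i \xi t}\right) \\
     &= \mathrm{Re}\left(2c \cdot e^{2\pi i \xi t}\right)
\end{aligned}
$$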




Geek Box 2.8: Fourier Series 


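The Fourier series (Equation 2.17) represents a periodic signal of
period $T$ by an infinite weighted sum of shifted cosine functions of
different frequencies. The Fourier coefficients $c$ are calculated using
Equation 2.16.


$$ c[k] = \frac{1}{T} \int_{d}^{d+T} f(t)\, e^{-2\pi i t k / T}\, \mathrm{d}t, \qquad k \in \mathbb{Z} \tag{2.16} $$

$$ f(t) = \sum_{k=-\infty}^{\infty} c[k]\, e^{2\pi i t k / T}, \qquad t \in \mathbb{R} \tag{2.17} $$


The coefficients $c[k]$ and $c[-k]$ together form a shifted cosine wave with
frequency $\xi = \frac{k}{T}$ (see Geek Box 2.7). It follows that
$c[-k] = \overline{c[k]}$:


$$
\begin{aligned}
c[k]\, e^{2\pi i t k / T} + c[-k]\, e^{-2\pi i t k / T} &= c[k]\, e^{2\pi i t k / T} + \overline{c[k]}\, e^{-2\pi i t k / T} \\
c[-k]\, e^{-2\pi i t k / T} &= \overline{c[k]}\, e^{-2\pi i t k / T} \\
\Rightarrow \quad c[-k] &= \overline{c[k]}
\end{aligned}
$$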


Geek Box 2.9: Continuous Fourier Transform 


Given a time-dependent signal $f$, its Fourier transform $F$ at frequency
$\xi$ is defined by Eq. (2.18). The inverse Fourier transform is defined by
Eq. (2.19).


$$ F(\xi) = \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i t \xi}\, \mathrm{d}t, \qquad \xi \in \mathbb{R} \tag{2.18} $$

$$ f(t) = \int_{-\infty}^{\infty} F(\xi)\, e^{2\pi i t \xi}\, \mathrm{d}\xi, \qquad t \in \mathbb{R} \tag{2.19} $$


In general, $f(t)$ can be a complex signal. We will, however, only
consider the case where $f(t)$ is real-valued. The continuous Fourier
transform is symbolized by the operator $\mathcal{F}$.
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Geek Box 2.10: Discrete-time Fourier Transform 


The spectrum (i.e., continuous Fourier transform) of a band-limited 
signal that is sampled equidistantly and sufficiently dense with dis- 
tance T can be calculated using the discrete-time Fourier transform 
(DTFT) defined by Equation 2.20. The inverse transform is given by 
Equation 2.21. For details about the required sampling distance see 
Sec. 2.4.2. 


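$$ F(\xi) = \sum_{n=-\infty}^{\infty} f[n]\, e^{-2\pi i \xi n T}, \qquad \xi \in \mathbb{R} \tag{2.20} $$

$$ f[n] = T \int_{d}^{d + \frac{1}{T}} F(\xi)\, e^{2\pi i \xi n T}\, \mathrm{d}\xi, \qquad n \in \mathbb{Z} \tag{2.21} $$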


Fig. 2.5(c) shows the DTFT of a band-limited function and the Fourier 
transform. The DTFT is identical to the Fourier transform up to 
scaling except that it is periodic with period 1/T. 


Geek Box 2.11: Discrete Fourier Transform 


The spectrum of a periodic and band-limited signal can be calcu- 
lated with the discrete Fourier transform (DFT) as defined by Equa- 
tion 2.22. The signal can be reconstructed with the inverse DFT as 
defined by Equation 2.23. 


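$$ F[k] = \sum_{n=0}^{N-1} f[n]\, e^{-2\pi i n k / N}, \qquad k \in \mathbb{Z} \tag{2.22} $$

$$ f[n] = \frac{1}{N} \sum_{k=0}^{N-1} F[k]\, e^{2\pi i n k / N}, \qquad n \in \mathbb{Z} \tag{2.23} $$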


Fig. 2.5(d) shows the DFT and the Fourier series of a band-limited 
signal. The DFT is identical to the Fourier series up to scaling except 
that it is periodic with period 1/N. 
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Geek Box 2.12: Discrete Fourier Transform as Matrix 


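A discrete Fourier transform can be rewritten as a complex matrix
product. To demonstrate this, we start with the definition of the
discrete Fourier transform:


$$ F[k] = \sum_{n=0}^{N-1} f[n]\, e^{-2\pi i n k / N} = \sum_{n=0}^{N-1} e^{-2\pi i n k / N}\, f[n] $$


Now, we replace the summation with an inner product of two vectors
$\boldsymbol{\xi}_k$ and $\mathbf{n}$ (cf. Geek Box 2.6):


$$ F[k] = \begin{pmatrix} e^{-2\pi i \cdot 0 \cdot k / N} & e^{-2\pi i \cdot 1 \cdot k / N} & \cdots & e^{-2\pi i (N-1) k / N} \end{pmatrix} \begin{pmatrix} f[0] \\ f[1] \\ \vdots \\ f[N-1] \end{pmatrix} = \boldsymbol{\xi}_k^\top \mathbf{n} $$


We see that $\boldsymbol{\xi}_k$ is a discretely sampled wave at frequency
$k$. This equation can now be interpreted as the $k$-th row of a matrix
vector product. Thus, we can rewrite the entire discrete Fourier transform of
all $K$ frequencies to


$$ \mathbf{k} = \begin{pmatrix} F[0] \\ \vdots \\ F[K-1] \end{pmatrix} = \begin{pmatrix} \boldsymbol{\xi}_0^\top \\ \vdots \\ \boldsymbol{\xi}_{K-1}^\top \end{pmatrix} \mathbf{n} = \mathbf{F}\mathbf{n} $$


As such, each row of the above matrix multiplication computes a
correlation between the signal and a wave of frequency $k$, for all $K$
frequencies under consideration. Furthermore, the relation
$\mathbf{F}^{\mathsf{H}} = \mathbf{F}^{-1}$ holds if $\mathbf{F}$ is
scaled with $\frac{1}{\sqrt{N}}$. Hence, the rows of the scaled $\mathbf{F}$
form an orthonormal basis. If we continue this line of thought, we can also
interpret a Fourier transform as a basis rotation. In our case, we do not
rotate by a certain angle, but we project our time-dependent signal into a
frequency-resolved, time-independent space.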
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Figure 2.5: Different types of Fourier transforms. 


A similar theorem exists for the DFT. Let $\mathbf{C}_h$ denote the matrix
that performs the convolution with the discrete impulse response $h$, and let
$\mathbf{f}$ be a discrete input signal. Then the system output $\mathbf{g}$
is obtained as


$$ \mathbf{g} = h * \mathbf{f} = \mathbf{C}_h \mathbf{f} = \mathbf{F}^{\mathsf{H}} \mathbf{H} \mathbf{F} \mathbf{f}, $$


where $\mathbf{H}$ is a diagonal matrix that contains the Fourier transformed
coefficients of $h$. Note that $\mathbf{F}$ and $\mathbf{F}^{\mathsf{H}}$ can
be implemented efficiently by means of the FFT. In addition to the convolution
theorem, the Fourier transform has other notable properties. Some of those
properties are listed in Table 2.3.
Description | Time | Frequency
------------|------|----------
Linearity | $a \cdot f(t) + b \cdot g(t)$ | $a \cdot F(\xi) + b \cdot G(\xi)$
Shift | $f(t - a)$ | $e^{-2\pi i a \xi} \cdot F(\xi)$
Scaling | $f(at)$ | $\frac{1}{\lvert a \rvert} \cdot F\left(\frac{\xi}{a}\right)$
Derivative | $\frac{\mathrm{d}^n f(t)}{\mathrm{d}t^n}$ | $(2\pi i \xi)^n \cdot F(\xi)$
Convolution theorem (see Sec. 2.3.2) | $(f * g)(t)$ | $F(\xi) \cdot G(\xi)$
Dual of the convolution theorem | $f(t) \cdot g(t)$ | $(F * G)(\xi)$

Table 2.3: Effects of modifications of a signal in time on the Fourier
transform.
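
The matrix form of the convolution theorem for the DFT can be checked on a small example. The sketch below assumes a circular (periodic) convolution, for which the diagonalization by the DFT matrix holds exactly, and a DFT matrix scaled by $1/\sqrt{N}$ as in Geek Box 2.12. All function names and the sample vectors are illustrative.

```python
import cmath


def dft_matrix(size):
    """DFT matrix F scaled by 1/sqrt(N), so that F^H is the inverse of F."""
    scale = 1 / size ** 0.5
    return [[scale * cmath.exp(-2j * cmath.pi * n * k / size)
             for n in range(size)]
            for k in range(size)]


def matvec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def hermitian(matrix):
    """Transpose and conjugate element-wise."""
    return [[matrix[r][c].conjugate() for r in range(len(matrix))]
            for c in range(len(matrix[0]))]


def circular_convolution(h, f):
    """g[n] = sum_k h[k] f[(n - k) mod N], i.e. the product C_h f."""
    size = len(f)
    return [sum(h[k] * f[(n - k) % size] for k in range(size))
            for n in range(size)]


h = [0.5, 0.25, 0.0, 0.25]
f = [1.0, 2.0, 3.0, 4.0]
F = dft_matrix(len(f))
# Diagonal of H: unscaled DFT of h (sqrt(N) undoes the 1/sqrt(N) scaling of F).
H_diag = [len(h) ** 0.5 * v for v in matvec(F, h)]
spectrum = matvec(F, f)
g_fourier = matvec(hermitian(F), [H * s for H, s in zip(H_diag, spectrum)])
g_direct = circular_convolution(h, f)
```

Both routes yield the same output signal up to rounding. In practice the two matrix products would be replaced by FFT calls.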
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Figure 2.6: Discrete system theory 


2.4 Discrete System Theory 


2.4.1 Motivation 


As already indicated in the introduction, discrete signals and systems are very 
important in practice. All signals can only be stored and processed at fixed 
discrete time instances in a digital computer. The process of transforming 
a continuous time signal to a discrete time signal is called sampling. In 
the simplest and most common case, the continuous signal is sampled at 
regular intervals, which is called uniform sampling. The current value of the 
continuous signal is stored exactly at the time instance where the discrete 
time signal is defined. This can be modeled by a multiplication with an impulse
train, see Fig. 2.6(a). At first glance, it looks like a lot of information is
discarded in the process of sampling. However, under certain requirements, 
the continuous time signal can be reconstructed exactly. Further details are 
given in Sec. 2.4.2. 

As we have already seen with the discrete Fourier transform, most methods
introduced in this chapter can be equally applied to discrete signals.
We denote discrete signals using brackets [] instead of parentheses (), as we
already did in the Geek Boxes. Integrals must be replaced by infinite sums,
for example for the discrete convolution


$$ g[n] = (h * f)[n] = \sum_{k=-\infty}^{\infty} h[k]\, f[n - k]. \tag{2.26} $$


In the discrete case, the Dirac function takes on a simple form:


$$ \delta[n] = \begin{cases} 1, & \text{if } n = 0 \\ 0, & \text{otherwise} \end{cases} \tag{2.27} $$


Note that in contrast to the continuous Dirac function, it is possible to exactly 
represent and implement the discrete Dirac function. 
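
As a small illustration of Eq. (2.26) and Eq. (2.27), the sketch below implements the convolution sum for finite sequences and applies it to a discrete Dirac impulse. It assumes that all sequences start at index 0 and are zero elsewhere; the names `convolve` and `dirac` are illustrative.

```python
def convolve(h, f):
    """Full discrete convolution g[n] = sum_k h[k] f[n - k] of finite sequences."""
    g = [0.0] * (len(h) + len(f) - 1)
    for k, h_k in enumerate(h):
        for m, f_m in enumerate(f):
            g[k + m] += h_k * f_m  # the term h[k] f[n - k] with n = k + m
    return g


def dirac(n):
    """Discrete Dirac function: 1 at n = 0, 0 otherwise."""
    return 1 if n == 0 else 0


f = [3.0, -1.0, 2.0]
delta = [dirac(n) for n in range(3)]              # impulse at n = 0
shifted_delta = [dirac(n - 1) for n in range(3)]  # impulse at n = 1
```

Convolving with `delta` returns the signal unchanged (followed by zeros), while convolving with `shifted_delta` delays it by one sample.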

In addition to the discrete independent variable, the dependent variable 
can also be discrete. This means that the signal value f(t) or f[n] can only 
take values of certain levels. Apart from naturally discrete signals, all signals 
must be converted to a fixed discrete value for representation and processing 
in digital computers. For example, image intensities are often represented 
in the computer using 8 bit, i.e., 256 different intensities, or 12 bit which 
corresponds to 4096 different levels. The process of transforming a continuous- 
valued signal to a discrete-valued signal is called quantization. In most 
cases, a uniform quantization is sufficient, which means that the discrete 
levels have equal distance from each other. The continuous-valued signal is 
rounded to the nearest discrete level available, see Fig. 2.6(b). The error 
arising during this process is called quantization noise. Some more details on 
noise and noise models are given in Sec. 2.4.3. 
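
Uniform quantization as described above can be sketched in a few lines. The value range `[0, 1]`, the function name `quantize`, and the sample values are assumptions for illustration; the number of levels follows the 8 bit and 12 bit examples in the text.

```python
def quantize(value, bits, v_min=0.0, v_max=1.0):
    """Round value to the nearest of 2**bits equally spaced levels.

    Returns the integer level index and the reconstructed level value.
    Values outside [v_min, v_max] are clipped first.
    """
    levels = 2 ** bits
    step = (v_max - v_min) / (levels - 1)
    clipped = min(max(value, v_min), v_max)
    index = round((clipped - v_min) / step)  # Python rounds exact ties to even
    return index, v_min + index * step


samples = [0.0, 0.1234, 0.5, 0.8, 1.0]
quantized = [quantize(s, bits=8) for s in samples]
# Quantization error per sample; it never exceeds half a quantization step.
errors = [abs(s - level) for s, (_, level) in zip(samples, quantized)]
step_8bit = 1.0 / 255
```

The error of each sample is bounded by half the distance between two levels, which is why more bits lead to less quantization noise.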


2.4.2 Sampling Theorem 


The Nyquist-Shannon sampling theorem (or just sampling theorem) states
that a band-limited signal, i.e., a signal where all frequencies above $\xi_B$
and below $-\xi_B$ are zero, can be fully reconstructed using samples
$1/(2\xi_B)$ apart. If we consider a sine wave of frequency $\xi_B$, we have to
sample it at least with a frequency of $2\xi_B$, i.e., twice per wavelength.

Figure 2.7: Sampling a sine signal with a frequency below $2\xi_B$ will cause
aliasing. The reconstructed sine wave shown with blue dashes does not match
the original frequency shown in red.


Formally, the theorem can be derived using the periodicity of the DTFT
(see Fig. 2.5(c)). The DTFT spectrum is a periodic summation of the original
spectrum, and the periodic copies do not overlap as long as the sampling
theorem is fulfilled. It is therefore possible to obtain the original spectrum
by setting the DTFT spectrum to zero for all frequencies whose magnitude is
larger than $\xi_B$. The signal can then be reconstructed by applying the
inverse Fourier transform. We refer to [3] for a more detailed description of
this topic.

So far, we have not discussed how the actual sampling frequency $2\xi_B$ is
determined. Luckily, such a band limitation can be found for most applications.
For example, even the most sensitive ears cannot perceive frequencies above
22 kHz. As a result, the sampling frequency of the compact disc (CD) was
set to 44.1 kHz. For the eye, typically 300 dots per inch in printing
or 300 pixels per inch for displays are considered sufficient to prevent any
visible distortions. In videos and films, a frame rate of 50 Hz is often used to
diminish flicker. High fidelity devices may support up to 100 Hz.

If the sampling theorem is not respected, aliasing occurs. Frequencies 
above the Nyquist frequency are wrapped around due to the periodicity and 
appear as lower frequencies. Then, these high frequencies are indistinguish- 
able from the true low frequencies. Fig. 2.7 demonstrates this effect visually. 
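
The wrap-around can be reproduced with a few samples. In the sketch below, the sampling rate of 10 Hz and the 9 Hz sine are assumed values: 9 Hz lies above the Nyquist frequency of 5 Hz and is therefore indistinguishable from a sine at 9 - 10 = -1 Hz after sampling.

```python
import math


def sample_sine(frequency, sampling_rate, count):
    """Sample sin(2 pi frequency t) at t = n / sampling_rate, n = 0..count-1."""
    return [math.sin(2 * math.pi * frequency * n / sampling_rate)
            for n in range(count)]


sampling_rate = 10.0                           # samples per second (assumed)
high = sample_sine(9.0, sampling_rate, 20)     # above the Nyquist frequency
alias = sample_sine(-1.0, sampling_rate, 20)   # 9 Hz wrapped around by 10 Hz
max_diff = max(abs(a - b) for a, b in zip(high, alias))
```

The two sample sequences agree up to rounding, so the samples alone do not reveal which of the two frequencies was present.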


2.4.3 Noise 


In many cases, acquired measurements or images are corrupted by some un- 
wanted signal components. Common noise sources are quantization and ther- 
mal noise. Additional noise sources occur in the field of medical imaging, due 
to the related image acquisition techniques. 

We can often find a simple model of the noise corrupting the image. The 
model does not represent the physical noise causes, but it approximately 
describes the errors that occur in the final signal. An additive noise model is 
commonly denoted as 
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Figure 2.8: Example of a white noise function 


$$ f(t) = s(t) + n(t) \tag{2.28} $$


where $s(t)$ is the underlying desired signal. We observe the signal $f(t)$,
which is corrupted by the noise $n(t)$. For the statistics of the noise, we
can use various models, e.g., a Gaussian noise distribution
$p(n(t)) = \mathcal{N}(n(t) \mid \mu, \sigma^2)$.
Another property of noise is its temporal or spatial correlation. This can be
described by correlating the signal with itself, which yields the
autocorrelation function. An extreme case is white noise. White noise is
temporally or spatially uncorrelated, meaning the autocorrelation function is a
Dirac impulse. The spectrum of white noise is constant, i.e., it contains all
frequencies to the same amount, just as a white light source would contain all
visible wavelengths (cf. Fig. 2.8).


2.5 Examples 


To conclude this chapter, we want to illustrate the introduced concepts of
convolution and Fourier transform on two example systems. A simple system is
a smoothing filter that allows only slow changes of the signal. This is called
a low-pass filter. It is an important building block in many applications, for
example to remove high-frequency noise from a signal or to remove
high-frequency signal parts before down-sampling to avoid aliasing.

The filter coefficients of a low-pass filter are visualized in Fig. 2.9(a). The 
low-pass filter has a cutoff frequency of ae and a length of 81 coef- 
ficients. The true properties of the low-pass filter are best perceived in the 
frequency domain, as displayed in Fig. 2.9(b). Note that the scale of the y-axis 
is logarithmic. On such a scale, a value of 0 corresponds to a gain of 1, i.e.,
the signal can pass unaltered. Smaller values indicate that the signal
components are damped. In this example, high frequencies are suppressed by
several orders of magnitude.
An ideal low-pass filter is a rectangle in the Fourier domain, i.e., all values 
below the cutoff frequency are passed unaltered and all values above are set 
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Figure 2.9: Example of a low-pass filter 
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Figure 2.10: Example of a high-pass filter 


to 0. In our discrete filter, we can only approximate this shape. In the time- 
domain, the coefficients are samples of a sinc function, which is the inverse 
Fourier transform of a rectangular function in Fourier domain (cf. Tab. 2.2). 
The opposite of the low-pass filter is the high-pass filter, shown in Fig. 2.10. 
Here, frequencies below the cutoff frequency are suppressed, whereas frequen- 
cies above are unaltered. Note that the time domain versions of high- and 
low-pass filters are difficult to differentiate. 
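
A discrete low-pass filter of this kind can be sketched by sampling and truncating a sinc function. The following is an illustrative sketch, not the filter shown in Fig. 2.9: the cutoff of 0.1 cycles per sample, the plain truncation without an additional window, and the function names are assumptions. Only the length of 81 coefficients is taken from the text.

```python
import cmath
import math


def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def low_pass(num_taps, cutoff):
    """Truncated-sinc low-pass filter; cutoff in cycles per sample (< 0.5)."""
    center = (num_taps - 1) // 2
    taps = [2 * cutoff * sinc(2 * cutoff * (n - center))
            for n in range(num_taps)]
    gain = sum(taps)
    return [t / gain for t in taps]  # unit gain at frequency 0


def magnitude_response(taps, frequency):
    """Magnitude of the filter's frequency response at the given frequency."""
    return abs(sum(t * cmath.exp(-2j * math.pi * frequency * n)
                   for n, t in enumerate(taps)))


taps = low_pass(num_taps=81, cutoff=0.1)     # cutoff value is an assumption
passband = magnitude_response(taps, 0.02)    # well below the cutoff
stopband = magnitude_response(taps, 0.3)     # well above the cutoff
```

The passband gain stays close to 1, whereas the stopband gain is small but not exactly 0, which reflects that the rectangular shape can only be approximated with a finite number of coefficients.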

Finally, we show how a signal with high and low frequency components 
is transformed after convolution with a high-pass and a low-pass filter. The 
signal in Fig. 2.11 is a sine with additive white noise. Thus, noise is distributed 
equally in the whole frequency domain. A large portion of the noise can be 
removed by suppressing frequency components where no signal is present. 
Consequently, the cutoff frequency of the filters is slightly above the frequency 
of the sine function. As a result, the output of the high-pass filter is similar 
to the noise and the output of the low-pass filter is similar to the sine. In our 
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Figure 2.11: Sine signal with additive noise after processing with a low-pass 
filter and a high-pass filter. 


example, we chose a causal filter which introduces a time delay in the filter 
output. A causal filter can only react to past inputs and needs to collect a 
certain amount of samples before the filtered result appears at the output. 

In the previous section, we have described common signal processing methods
that may be applied to any kind of signal. In this chapter, we will adapt
these concepts to the domain of image processing as it is commonly performed
in medical imaging devices.


3.1 Images and Histograms 


Before introducing common methods used in image processing, we first have 
to introduce a representation for images. 


3.1.1 Images as Functions 


In image processing, an image is usually regarded as a function f that maps 
image coordinates x, y to intensity values. This simplifies the introduction of 
derivatives of images which we will later use to detect edges. Furthermore, it is 
useful for the theoretical description of image filtering. The image coordinates
$x, y$ are defined over the discrete image domain $\Omega \subset \mathbb{Z}^2$.
For gray-value images, $f(x, y)$ is a scalar function, whereas for color images
it is a vector consisting of the color channels, e.g.,
$f(x, y) = (f_r(x, y), f_g(x, y), f_b(x, y))$ for RGB images.

Figure 3.1: Histograms


3.1.2 Histograms of Images 


Histograms provide information about the distribution of the intensity values
of an image and are frequently used in image segmentation and in image
enhancement. A histogram $h(i)$ consists of several bins that contain single
intensity values or ranges of intensities. For each bin $i$, the number of
occurrences $n_i$ of the corresponding intensity values in the image is counted.
The histogram may either contain these numbers of occurrences directly, or it
can be normalized such that the sum over all bins equals one (L1-normalization).
Fig. 3.1 shows an example grayscale image and two histograms with different
numbers of bins.

In the context of histograms, we can also introduce the cumulative
distribution function (CDF) $\mathrm{cdf}(i) = \sum_{j=0}^{i} h(j)$. The CDF
sums up the histogram's entries and can be calculated for regular as well as
normalized histograms.
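
The definitions above translate directly into code. The sketch below is a minimal illustration for 8-bit images stored as nested lists; the toy image and the helper names `histogram`, `normalize`, and `cdf` are made up for this example.

```python
def histogram(image, num_bins, max_value=255):
    """Count pixels per bin; each bin covers an equal range of intensities."""
    counts = [0] * num_bins
    bin_width = (max_value + 1) / num_bins
    for row in image:
        for value in row:
            counts[int(value / bin_width)] += 1
    return counts


def normalize(counts):
    """L1-normalize the histogram so that its entries sum to one."""
    total = sum(counts)
    return [c / total for c in counts]


def cdf(hist):
    """Cumulative distribution function: running sum of the histogram."""
    running, out = 0, []
    for h in hist:
        running += h
        out.append(running)
    return out


image = [[0, 10, 200], [255, 10, 130]]   # toy 8-bit image (made up)
counts = histogram(image, num_bins=4)    # four bins of width 64
```

For a normalized histogram the last CDF entry is 1, for an unnormalized one it equals the number of pixels.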


3.2 Image Enhancement 


For visual inspection, it is often beneficial to change the contrast of an image.
For example, a computer monitor can only display 8 bit (i.e., 256 different
values) for each color channel. For an appropriate display of an image with
a larger color depth, the intensity values therefore have to be scaled to this
range in a meaningful way. The easiest way of doing this is by applying a
function $g$ to the intensity values


$$ f'(x, y) = g(f(x, y)). \tag{3.1} $$


Figure 3.2: Effect of window and level on a C-arm CT slice: The image
on the left shows a slice of an animal experiment displaying -1000 to 1000
HU. In this range, the materials air, soft tissue, contrast agent, and bone are
displayed. The image on the right shows the same slice in the range from 200
to 500 HU. Now, only contrast agent and bone are visible. Image courtesy of
Stanford University.


3.2.1 Window and Level 


In images that have significantly more gray values than 8 bit, semi-automatic
adjustment of the display using window and level functions is common. In
CT, the gray values have known physical properties and allow interpretation
of the material. This effect is displayed in Fig. 3.2. The image on the left
hand side displays Hounsfield units (HU) from -1000 to 1000 (cf. Tab. 8.1).
This range covers the materials air, soft tissue, contrast agent, and bone.
The image on the right shows exactly the same slice. The displayed range is
now from 200 to 500 HU, which shows only contrast agent in the heart and
cortical bone. Note that the image was obtained from a C-arm system, which
shows significantly lower image quality than conventional CT.
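
A window function of this kind can be sketched as a clipped linear mapping. The function name `apply_window` and its parameters are illustrative assumptions; the window is given here by its lower and upper bound, which is equivalent to specifying a level (the center of the window) and a width (upper minus lower bound).

```python
def apply_window(value, lower, upper, out_max=255):
    """Map [lower, upper] linearly to [0, out_max]; clip values outside."""
    clipped = min(max(value, lower), upper)
    return round((clipped - lower) / (upper - lower) * out_max)


# Display range of -1000 to 1000 HU: air maps to black, dense bone to white.
air = apply_window(-1000, -1000, 1000)
water = apply_window(0, -1000, 1000)

# Display range of 200 to 500 HU: everything below 200 HU becomes black.
soft_tissue = apply_window(40, 200, 500)
dense = apply_window(3000, 200, 500)
```

With the narrow window, soft tissue is clipped to 0 and only values above 200 HU remain visible, which matches the behavior described for the right image.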
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Figure 3.3: Gamma correction 


3.2.2 Gamma Correction 


A common choice for $g$ is a power law, i.e., $g(f) = A \cdot f^{\gamma}$ (the
constant $A$ is used for normalizing the resulting intensities). This type of
contrast enhancement is called gamma correction and is adapted to how the
human eye perceives images. The result of applying a gamma correction to
an image is shown in Fig. 3.3, for values of $\gamma$ both smaller and larger
than 1.0. As this transformation is the same for all pixel locations, it is
called a global transformation. Other types of functions, such as the
logarithm, may also be used for $g$.
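
The power law can be written as a one-line function. In the sketch below, the normalization constant $A$ is chosen such that the maximum intensity is mapped to itself; this particular choice and the name `gamma_correct` are assumptions for illustration.

```python
def gamma_correct(value, gamma, max_value=255):
    """Apply g(f) = A * f**gamma with A = max_value**(1 - gamma)."""
    a = max_value ** (1 - gamma)  # ensures g(max_value) == max_value
    return a * value ** gamma


brightened = gamma_correct(64, 0.5)   # gamma < 1 raises mid-range intensities
darkened = gamma_correct(64, 2.0)     # gamma > 1 lowers mid-range intensities
```

Both ends of the intensity range stay fixed, while the values in between are shifted up or down depending on $\gamma$.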


3.2.3 Histogram Equalization 


A different approach to enhance the display of an image is histogram
equalization. Often the intensity values found in an image are restricted to a
small range of the possible values (i.e., one narrow peak in the histogram).
Histogram equalization transforms the image such that all intensity values are
equally distributed in the enhanced image. An equal distribution of intensity
values is equivalent to a linear CDF $\mathrm{cdf}_{\text{linear}}(i) = a \cdot i$
with the slope $a$ depending on the number of pixels in the image. Fig. 3.4
shows an example for histogram equalization.
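
One simple way to approximate a linear CDF is to map every intensity through the normalized CDF of the image itself. The sketch below shows this basic variant for integer images; other variants exist, and the function name `equalize` and the toy image are assumptions.

```python
def equalize(image, levels=256):
    """Map each intensity through the normalized CDF of the image histogram."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    # Scale the CDF value of each pixel to the full output range.
    return [[round(cdf[value] * (levels - 1)) for value in row]
            for row in image]


# Toy image whose intensities occupy only the narrow range 100..103.
image = [[100, 100, 101, 101], [102, 102, 103, 103]]
equalized = equalize(image)
```

After the mapping, the four occupied intensities are spread over the whole range from 64 to 255 instead of being clustered around 100.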


3.3 Edge Detection 


Edge detection is a common problem in image processing. What we perceive 
as edges in an image are strong changes between neighboring intensities. Since 
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Figure 3.4: Histogram equalization. 


images can be interpreted as functions, we can find these changes by taking
the derivative of an image. For continuous functions $f(x)$, the derivative is
defined by the limit of the difference quotient


$$ \frac{\mathrm{d} f(x)}{\mathrm{d} x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \tag{3.2} $$


An image, however, is defined as a function over discrete coordinates, where
the above formula is not defined. In this case, a derivative can be
approximated by using finite differences, which are similar to the difference
quotient. The simplest one is the forward difference:


$$ \Delta_x f(x) = f(x + 1) - f(x). \tag{3.3} $$




Figure 3.5: Derivatives of an image. The image intensities are normalized
for better visualization. A neutral gray indicates a derivative of zero. Darker
and brighter intensities correspond to negative and positive values of the
derivative, respectively. (a) and (b) show the first derivative in x- and
y-direction, respectively. Note how in (b) horizontal lines are more pronounced
(e.g., the tip of the hat). In (c), the second derivative in x-direction is
shown. Here, thin lines (e.g., of the hair) are better visible.


Comparing the continuous derivative and the forward difference, we can see 
that only the meaning of h has changed. While for the derivative of a con- 
tinuous function, h goes to 0, it is replaced by the constant spacing (usually 
h = 1) in the discrete approximation. 

In contrast to the continuous case, many different approximations can
be used for the discrete derivative. Along the same line as for the forward
difference, we can define the backward and central differences


$$ \nabla_x f(x) = f(x) - f(x - 1) \tag{3.4} $$
$$ \delta_x f(x) = f(x + 1) - f(x - 1) \tag{3.5} $$


These approximations differ in their applicability at the borders of the image
and in their accuracy. The central difference, for example, is a more accurate
approximation. While these three approximations are often used in practice,
more complicated ones can be constructed. By using more values of the
function, i.e., $f(x + 2h)$, $f(x + 3h)$, etc., these approximations become
more accurate.

Analogous to the first order derivative, we can also calculate the second 
order derivative using finite differences 


$$ \delta_x^2 f(x) = f(x + h) - 2 f(x) + f(x - h) \tag{3.6} $$


Fig. 3.5 shows the derivatives of an example image. 
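
The finite differences can be compared on a signal whose derivative is known. The sketch below uses samples of $x^2$ with spacing $h = 1$; the function names are illustrative. Note that the central difference as written in Eq. (3.5) spans two samples, so dividing it by 2 gives the slope estimate.

```python
def forward_difference(f, x):
    """Eq. (3.3): f(x + 1) - f(x)."""
    return f[x + 1] - f[x]


def backward_difference(f, x):
    """Eq. (3.4): f(x) - f(x - 1)."""
    return f[x] - f[x - 1]


def central_difference(f, x):
    """Eq. (3.5): f(x + 1) - f(x - 1), a difference over two samples."""
    return f[x + 1] - f[x - 1]


def second_difference(f, x):
    """Eq. (3.6) with h = 1: f(x + 1) - 2 f(x) + f(x - 1)."""
    return f[x + 1] - 2 * f[x] + f[x - 1]


row = [0, 1, 4, 9, 16, 25]   # samples of x**2 at x = 0..5
x = 2                        # true first derivative is 4, second derivative is 2
```

At `x = 2`, the forward and backward differences give 5 and 3, while the central difference divided by 2 gives exactly 4. None of the functions can be evaluated at both borders of `row`, which illustrates the remark about applicability at the image borders.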
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Figure 3.6: Construction of the central difference filter from Eq. (3.5). 


3.4 Image Filtering 


Filters play an important role in image processing. In this section, we intro- 
duce filters by using edge detection as a first example. Next, we stress the 
importance of the properties of linearity and shift invariance of filters. 

In the previous section, we introduced the derivatives of images to identify 
edges in an image. However, we did not specify how the derivatives can be 
calculated efficiently for an image. In practice this is done by representing 
the derivatives as discrete filter kernels (see below). 


3.4.1 Filtering — Basics 


Filters can be applied to images in order to process them, e.g., for noise
reduction. The transformation of the image is determined by the filter kernel
$h(i, j)$, a rectangular patch of size $w_h \times h_h$ (the point (0,0) is
assumed to be at the center of the kernel). At each location in the image, the
kernel is applied, and the resulting value is the new value at the center
position of the kernel. Fig. 3.6 shows the filter kernel corresponding to the
central difference along the direction of the x-axis, and how it is constructed
from Eq. (3.5). Some other possible filter kernels for calculating derivatives
are shown in Fig. 3.7.

In mathematical terms, a filter $H$ is treated as an operator that is applied
to the input image


$$ H\{f(x, y)\} = r(x, y), \tag{3.7} $$


where $r$ is the resulting filtered image.

When filters are linear and shift-invariant (cf. Geek Box 3.1), they can be
efficiently calculated by convolution. The filtering of an image $f(x, y)$ with
the filter kernel $h(i, j)$ can then be expressed as


$$ H\{f(x, y)\} = f * h = \sum_{i=-w_h/2}^{w_h/2} \; \sum_{j=-h_h/2}^{h_h/2} f(x - i, y - j)\, h(i, j). \tag{3.8} $$
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Figure 3.7: The derivative filters corresponding to the finite differences in- 
troduced in Sec. 3.3. The red circles mark the support of the filter. 


Geek Box 3.1: Linearity and Shift-invariance of 2-D Filters 


Analogous to the 1-D operators introduced in Sec. 2.1.2, filters can
possess some important properties such as shift-invariance and linearity.
For 2-D images, a filter $H$ is linear when a scaling of the input
of the filter corresponds to a scaling of the output, and when filtering
an image that is the sum of two images $f_1$, $f_2$ gives the same result as
filtering the two images separately and then adding the results:


$$ H\{a \cdot f(x, y)\} = a \cdot H\{f(x, y)\}, $$
$$ H\{f_1(x, y) + f_2(x, y)\} = H\{f_1(x, y)\} + H\{f_2(x, y)\}. $$


Filters are shift-invariant when the filter does not change when we
shift it over an image, i.e., the filter is independent of the position in
the image it is applied to:


$$ H\{f(x - x_0, y - y_0)\} = r(x - x_0, y - y_0). $$


For small images and kernels, this expression is usually calculated for each 
pixel of the filtered image individually. For large images, however, we can use 
the property of the Fourier transformation that a convolution in the spatial 
domain is a multiplication in the frequency domain (cf. Geek Box 3.2). 
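
The per-pixel evaluation of the convolution sum in Eq. (3.8) can be sketched as follows. The sketch assumes kernels of odd size, treats pixels outside the image as 0, and uses the name `convolve2d` for illustration. Because a convolution mirrors the kernel, the kernel array `[1, 0, -1]` is used here to obtain the central difference $f(x + 1) - f(x - 1)$; this layout is an assumption and need not match the kernel drawn in Fig. 3.6.

```python
def convolve2d(image, kernel):
    """Direct evaluation of the convolution sum for every pixel.

    The kernel origin (0, 0) is at the kernel center. Pixels outside the
    image are treated as 0 (one of several possible border conventions).
    """
    height, width = len(image), len(image[0])
    c_y, c_x = len(kernel) // 2, len(kernel[0]) // 2
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j in range(-c_y, c_y + 1):
                for i in range(-c_x, c_x + 1):
                    yy, xx = y - j, x - i
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += image[yy][xx] * kernel[j + c_y][i + c_x]
            result[y][x] = acc
    return result


# h(-1) = 1, h(0) = 0, h(1) = -1: yields f(x + 1) - f(x - 1) along x.
central_x = [[1.0, 0.0, -1.0]]
image = [[0, 0, 10, 10],
         [0, 0, 10, 10]]
edges = convolve2d(image, central_x)
```

The response is large at the vertical edge in the middle of the image. The negative value in the last column is caused by the zero padding at the border.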


3.4.2 Linear Shift-invariant Filters in Image Processing 


We now take a look at some linear shift-invariant filters that are often used 
in image processing. 
Average / Mean / Box Filter 


This is the most basic filter. Each element of the kernel has the same value.
In order to prevent a change in the range of intensities, the kernel should be 
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Geek Box 3.2: 2-D Fourier Transform 


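In the previous chapter, the Fourier transform for 1-D signals (cf.
Sec. 2.3) was introduced. The transformation can also be defined for
2-D signals, i.e., images. The Fourier transform $F(u, v)$ and its inverse
transform for continuous functions $f(x, y)$ are defined by


$$ F(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{-2\pi i (ux + vy)}\, \mathrm{d}x\, \mathrm{d}y $$

$$ f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v)\, e^{2\pi i (ux + vy)}\, \mathrm{d}u\, \mathrm{d}v. $$


As before in the 1-D case, this transform can also be discretized, which
is needed for transforming images. For an image $f$ of size $w_f \times h_f$,
the discrete Fourier transform and the inverse Fourier transform are
given by


$$ F(u, v) = \sum_{x=0}^{w_f - 1} \sum_{y=0}^{h_f - 1} f(x, y)\, e^{-2\pi i (ux / w_f + vy / h_f)} $$

$$ f(x, y) = \frac{1}{w_f h_f} \sum_{u=0}^{w_f - 1} \sum_{v=0}^{h_f - 1} F(u, v)\, e^{2\pi i (ux / w_f + vy / h_f)} $$


The convolution theorem that was introduced for the 1-D Fourier
transform also exists for the 2-D case


$$ \mathcal{F}\{f * g\} = F \cdot G $$
$$ \mathcal{F}\{f \cdot g\} = F * G $$
$$ \mathrm{DFT}(f * g) = F \cdot G $$

and allows us to filter an image with a filter kernel by multiplying their
Fourier transforms and then applying the inverse Fourier transform
to the result.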


normalized, which leads to the filter kernel shown in Fig. 3.8. The averaging 
(or mean filter or box filter) blurs an image and can be used to reduce noise 
in an image. There are, however, better filter choices for this task, as the 
averaging filter leads to “ringing” artifacts near edges. 
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Figure 3.9: Lena after Gaussian filtering 


Gaussian filter 


A better choice for blurring an image than the averaging filter is the Gaussian
filter. This filter puts a higher weight on values near its center. The values
of the kernel can be calculated by the isotropic, zero-mean 2-D Gaussian
function

$$ h(i, j) = N \cdot e^{-\frac{i^2 + j^2}{2\sigma^2}} \tag{3.9} $$

where $N$ is used to normalize the kernel. The parameter $\sigma$ determines how
strongly the image is blurred. A small value puts the emphasis on the central
pixels when applying the filter, whereas for a large value the weights of
neighboring pixels are larger. A typical Gaussian filter and the effect of
different values of $\sigma$ are shown in Fig. 3.9.
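
The influence of $\sigma$ on the kernel weights can be inspected by sampling the Gaussian function on a small grid. The sketch below assumes an odd kernel size and chooses the normalization such that all weights sum to one; the name `gaussian_kernel` and the two values of $\sigma$ are illustrative.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample exp(-(i^2 + j^2) / (2 sigma^2)) on a size x size grid.

    The kernel is normalized so that all weights sum to one.
    """
    half = size // 2
    kernel = [[math.exp(-(i * i + j * j) / (2 * sigma ** 2))
               for i in range(-half, half + 1)]
              for j in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


narrow = gaussian_kernel(5, sigma=0.5)   # weight concentrated at the center
wide = gaussian_kernel(5, sigma=2.0)     # weight spread over the neighbors
```

For the small $\sigma$ almost all weight sits on the central pixel, whereas for the large $\sigma$ the corner pixels receive a noticeable share, which leads to stronger blurring.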
Figure 3.10: Prewitt and Sobel
Figure 3.11: Calculating the median (red circle) of an array. The array is
first sorted. The median then is the middle value of the array.


Prewitt and Sobel filter 


The Prewitt and Sobel filters are combinations of a blurring and a derivative
filter. In practice, image noise leads to many falsely detected edges when one
of the derivative filters in Fig. 3.7 is used. It therefore makes sense either
to slightly blur the image before the edges are found (not too much, otherwise
the edges vanish), or to use the Prewitt and Sobel filters.

The Prewitt filter corresponds to blurring the image with the averaging
filter before calculating the derivative. The Sobel filter can be thought of as
a combination of a Gaussian filter and the central derivative. Example filter
kernels are shown in Fig. 3.10(a) and Fig. 3.10(b), respectively.
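The combination of blurring and differentiation can be illustrated by building both kernels as outer products of a 1-D smoothing kernel and a 1-D central derivative. This is a sketch using the commonly quoted 3 x 3 kernels for the derivative in x-direction; the sign convention and the missing normalization are assumptions and may differ from the kernels in Fig. 3.10.

```python
def outer(column, row):
    """Outer product of two 1-D kernels, giving a 2-D kernel as a list of rows."""
    return [[c * r for r in row] for c in column]

central_derivative = [-1, 0, 1]   # derivative along x (sign convention assumed)
box_smoothing = [1, 1, 1]         # unnormalized averaging along y
gauss_smoothing = [1, 2, 1]       # unnormalized binomial approximation of a Gaussian

prewitt_x = outer(box_smoothing, central_derivative)
sobel_x = outer(gauss_smoothing, central_derivative)
```

The kernels for the y-direction follow by swapping the two arguments of `outer`.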


3.4.3 Nonlinear Filters — the Median Filter 


There exist of course other filters that are not linear and shift-invariant. The
most often used one is the median filter. The median of a set of numbers
is the value that is at the center of the sorted set (cf. Fig. 3.11), and the
median filter sorts the intensity values within its kernel and places the median
value at the center. It is for example used to reduce salt-and-pepper noise
(sparsely appearing black and white pixels, cf. Fig. 3.12) or in general for
noise reduction, as it preserves edges better than the averaging or Gaussian
filter. However, since the median filter is not linear, it cannot be applied
using convolution and is therefore not as computationally efficient. In fact,
for each location in the image, the surrounding pixels have to be sorted
individually to calculate the median value.

Figure 3.12: Example for using a median filter to reduce salt-and-pepper
noise in an image. (a) Image corrupted by salt-and-pepper noise. (b) Median
filtered image (3 x 3 kernel).
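The per-pixel sorting of the median filter can be sketched in a few lines. The function name `median_filter` and the border handling (replicating the nearest border pixel) are assumptions for this illustration; the text above does not prescribe a particular border treatment.

```python
def median_filter(image, size=3):
    """Apply a size x size median filter; border pixels are replicated (an assumption)."""
    height, width = len(image), len(image[0])
    half = size // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for j in range(-half, half + 1):
                for i in range(-half, half + 1):
                    yy = min(max(y + j, 0), height - 1)   # clamp to the image border
                    xx = min(max(x + i, 0), width - 1)
                    window.append(image[yy][xx])
            window.sort()                                  # sorting is needed per pixel
            result[y][x] = window[len(window) // 2]        # middle of the sorted values
    return result

# Flat gray image with one "salt" and one "pepper" pixel
noisy = [[100] * 5 for _ in range(5)]
noisy[1][1] = 255
noisy[3][2] = 0
filtered = median_filter(noisy)
```

In this example both outliers disappear, because each 3 x 3 window contains at most two of them among nine values.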


3.5 Morphological Operators 


Morphological operators are operators on sets, so in order to introduce them
for images, we have to treat images as sets. So far, we have treated images
as discrete functions. They can, however, also be treated as sets that contain
tuples (x, y, f(x, y)) of coordinates and intensity values as their elements

$$F = \{(x_0, y_0, f(x_0, y_0)), (x_1, y_1, f(x_1, y_1)), \ldots, (x_n, y_n, f(x_n, y_n))\}. \tag{3.10}$$

A morphological operator now consists of a structuring element (which is
an image/set) and the operation itself. Structuring elements are sets whose
values are usually binary, corresponding to background and foreground:

$$S = \{(x_0, y_0), (x_1, y_1), \ldots\}. \tag{3.11}$$
Figure 3.13: Example structuring elements: (a) block, (b) cross, (c) diamond,
(d) bar, (e) L-shaped. Foreground pixels are dark green, background pixels
white. The center is marked by a white circle.


Depending on the goal, different shapes and sizes are used, of which some 
common ones are shown in Fig. 3.13. As operations themselves, there are 
four basic ones: erosion, dilation, opening, and closing, where the opening 
and closing operations actually are compositions of erosion and dilation. 

Morphological operators are often applied to binary images, i.e., segmented
images, where only foreground and background are distinguished
(cf. Sec. 3.6). Such an image can be described by the set of pixels that are
marked with a specific value, i.e.,

$$F_{\text{bin}} = \{(x, y) \mid f(x, y) = 1\}. \tag{3.12}$$


The basic idea behind binary morphological operations is that the structuring
element is shifted over the binary image. The image is then transformed
by the operation, depending on how the structuring element fits to
the shapes visible in the image, or how it misses them. For all operations, the
structuring element S has to be shifted over the image and we will denote
the structuring element at image coordinates (x, y) as

$$S_{(x,y)} = \{(x + i, y + j) \mid (i, j) \in S\}. \tag{3.13}$$


Binary Erosion 


Using the set notation for images and structuring elements, erosion of an
image F with the structuring element S can be written as

$$F_{\text{bin}} \ominus S = \{(x, y) \mid S_{(x,y)} \subseteq F_{\text{bin}}\}. \tag{3.14}$$

This means that the eroded image contains foreground values only where the
whole structuring element falls in the foreground region of the image. This
is illustrated in Fig. 3.14 where a rectangular shape is eroded with a
cross-shaped structuring element.
Figure 3.14: An example for binary erosion with a cross-shaped structuring
element. (a) Input image. (b) Structuring element overlaid at several
positions. (c) Eroded image. A green structuring element means that the
center of the structuring element will be foreground in the eroded image, a
red structuring element means that it will be a background pixel. Note that
the structuring element is only green when it is completely covered by the
block.


Binary Dilation 


Using the set notation, binary dilation of an image F with the structuring
element S can be written as

$$F_{\text{bin}} \oplus S = \{(x, y) \mid S_{(x,y)} \cap F_{\text{bin}} \neq \emptyset\}. \tag{3.15}$$

This means that in the dilated image all pixel locations where the overlap
between the structuring element and the foreground pixels in the image is
not empty will be a foreground pixel. This is illustrated with an example in
Fig. 3.15.
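Because both operations are defined on sets, they translate almost literally into Python sets of coordinates. The following sketch assumes that the image is given by its set of foreground coordinates together with its width and height; the names `shifted`, `erode`, and `dilate` are illustrative.

```python
def shifted(structuring_element, x, y):
    """S_(x,y): the structuring element moved to image coordinates (x, y)."""
    return {(x + i, y + j) for (i, j) in structuring_element}

def erode(foreground, structuring_element, width, height):
    """Keep (x, y) only if the shifted element lies completely in the foreground."""
    return {(x, y) for x in range(width) for y in range(height)
            if shifted(structuring_element, x, y) <= foreground}

def dilate(foreground, structuring_element, width, height):
    """Keep (x, y) if the shifted element overlaps the foreground anywhere."""
    return {(x, y) for x in range(width) for y in range(height)
            if shifted(structuring_element, x, y) & foreground}

cross = {(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)}
# 3 x 3 block of foreground pixels inside a 7 x 7 image
block = {(x, y) for x in range(2, 5) for y in range(2, 5)}

eroded = erode(block, cross, 7, 7)     # only the center (3, 3) survives
dilated = dilate(block, cross, 7, 7)   # the block grows by one pixel along its edges
```

The subset test `<=` corresponds to the condition in the erosion, the non-empty intersection `&` to the condition in the dilation.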


Binary Opening 


Opening is a composition of erosion and dilation. The image is first eroded
and the result is then dilated with the same structuring element S

$$F_{\text{bin}} \circ S = (F_{\text{bin}} \ominus S) \oplus S. \tag{3.16}$$

This operation can be used to remove single noise pixels, or small extensions
of larger structures (cf. Fig. 3.16).
Figure 3.15: An example for binary dilation with a cross-shaped structuring
element. (a) Input image. (b) Structuring element overlaid at several
positions. (c) Dilated image. A green structuring element means that the
center of the structuring element will be foreground in the dilated image, a
red structuring element means that it will be a background pixel. Note that
the structuring element is green even when only a part of it overlaps with
the block.


Figure 3.16: An example for opening an image with a block-shaped structuring
element. (a) Input image with structuring element overlaid for the erosion.
(b) Eroded image with structuring element overlaid for the dilation.
(c) Opened image after dilating the eroded image. The image is first eroded
and the result is then dilated. Note how the thin extension of the input shape
at its bottom is removed by the opening operation.


Binary Closing 


Closing, too, is a composition of dilation and erosion. In contrast to opening,
the image is, however, first dilated and then eroded with the same structuring
element S

$$F_{\text{bin}} \bullet S = (F_{\text{bin}} \oplus S) \ominus S. \tag{3.17}$$




Figure 3.17: An example for closing an image with a block-shaped structuring
element. (a) Input image with the structuring element overlaid for the
dilation. (b) Dilated image with structuring element overlaid for the erosion.
(c) Closed image after eroding the dilated image. The image is first dilated
and then eroded. Note how the gap between the two blocks is filled by the
closing operation.


Closing can be used to fill small gaps in shapes (cf. Fig. 3.17). Note that
all morphological operators can also be applied to grayscale images (cf.
Geek Box 3.3).
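The two compositions can be tried out on small 0/1 images stored as lists of rows. In this sketch, pixels outside the image are treated as background, which is an assumption about the border handling; the helper names and the test images are illustrative.

```python
def pixel(image, x, y):
    """Value of a 0/1 image at (x, y); everything outside counts as background."""
    inside = 0 <= y < len(image) and 0 <= x < len(image[0])
    return image[y][x] if inside else 0

def erode(image, offsets):
    return [[int(all(pixel(image, x + i, y + j) for i, j in offsets))
             for x in range(len(image[0]))] for y in range(len(image))]

def dilate(image, offsets):
    return [[int(any(pixel(image, x + i, y + j) for i, j in offsets))
             for x in range(len(image[0]))] for y in range(len(image))]

def opening(image, offsets):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(image, offsets), offsets)

def closing(image, offsets):
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(image, offsets), offsets)

block = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]   # 3 x 3 block element

# A 3 x 3 square plus an isolated noise pixel
noisy = [[0, 0, 0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0, 1, 0],
         [0, 1, 1, 1, 0, 0, 0],
         [0, 1, 1, 1, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0]]
opened = opening(noisy, block)     # the noise pixel is removed

# Two 3 x 3 squares separated by a one-pixel gap
gapped = [[0, 0, 0, 0, 0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0, 0, 0, 0, 0]]
closed = closing(gapped, block)    # the gap is filled
```

The square in `noisy` survives the opening unchanged, while the two squares in `gapped` are merged into one block by the closing.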


3.6 Image Segmentation 


In general, image segmentation is the process of converting a grayscale image
with L different intensity values (e.g., L = 256 for an 8-bit image) into
an image with $L_{\text{seg}} < L$ gray levels. In the resulting segmented image, the
different intensity values partition the image into different regions. This can
for example mean a distinction between foreground and background, bones
and soft tissue, tumor and healthy tissue, or even between several different
kinds of tissue. In most applications, however, $L_{\text{seg}} = 2$ and the result is a
binary image.

The most basic method for image segmentation is thresholding. For each
pixel, its intensity value is compared to a threshold $\theta$, and depending on
whether it reaches the threshold or not, it is assigned to one or the other class:

$$g(x, y) = \begin{cases} 1 & \text{if } f(x, y) \geq \theta \\ 0 & \text{otherwise} \end{cases} \tag{3.20}$$

The only question then is how to choose the threshold value $\theta$. Besides trying
different values for $\theta$ by hand, the histogram of the image often contains useful
information on how to choose the threshold. There are also some algorithms
that determine the threshold automatically.
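Thresholding itself is a one-line operation per pixel. The following sketch applies Eq. (3.20) to a small example image; the function name `threshold` and the intensity values are illustrative.

```python
def threshold(image, theta):
    """Binary segmentation: 1 where the intensity is at least theta, else 0."""
    return [[1 if value >= theta else 0 for value in row] for row in image]

image = [[12, 40, 200],
         [35, 180, 220],
         [10, 128, 127]]
segmented = threshold(image, 128)
```

Note that a pixel with intensity exactly equal to the threshold (here 128) is assigned to the foreground, whereas 127 is not.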
Geek Box 3.3: Grayscale Morphological Operators


Morphological operators can also be defined for grayscale images (and 
therefore also for color images, where each channel is transformed 
independently). For opening and closing the same definition as in the 
binary case can be used. Erosion and dilation, however, have to be 
redefined since in grayscale images there is no binary decision on what 
is foreground and what is background. 


Grayscale Erosion 


Like in the binary case, the structuring element is again shifted over
the whole image. The eroded image at location (x, y) is now defined
as the minimum in the overlap of the structuring element S at this
position and the image F

$$(F \ominus S)(x, y) = \min_{(i, j) \in S} f(x + i, y + j). \tag{3.18}$$


Grayscale Dilation


The dilation for grayscale images is defined in a similar way as the
maximum in the overlap of structuring element S shifted to the position
(x, y) and the image F

$$(F \oplus S)(x, y) = \max_{(i, j) \in S} f(x + i, y + j). \tag{3.19}$$


Grayscale Opening / Closing 


The grayscale definitions of opening and closing are the same as 
for the binary case using the grayscale versions of dilation and erosion. 
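A minimal sketch of the grayscale definitions replaces the binary decisions by a minimum and a maximum over the overlap. Offsets that fall outside the image are simply skipped here, which is an assumption about the border handling; the function names are illustrative.

```python
def grayscale_morph(image, offsets, reduce):
    """Apply min (erosion) or max (dilation) over the structuring element.

    Offsets that fall outside the image are skipped (a border-handling assumption).
    """
    h, w = len(image), len(image[0])
    return [[reduce(image[y + j][x + i] for i, j in offsets
                    if 0 <= x + i < w and 0 <= y + j < h)
             for x in range(w)] for y in range(h)]

def grayscale_erode(image, offsets):
    return grayscale_morph(image, offsets, min)

def grayscale_dilate(image, offsets):
    return grayscale_morph(image, offsets, max)

cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
image = [[10, 10, 10],
         [10, 90, 10],
         [10, 10, 50]]
eroded = grayscale_erode(image, cross)     # bright spots shrink
dilated = grayscale_dilate(image, cross)   # bright spots grow
```

In the example, the erosion removes both bright pixels, while the dilation spreads the value 90 to its four direct neighbors.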


[Figure: input image, grayscale erosion, grayscale dilation]
Bimodal Histograms — Intersection of Gaussians


In many cases the histogram of an image where foreground and background
should be separated is bimodal. The assumption here is that the foreground
and background each have one characteristic intensity value. Due to shading,
this leads to two peaks in the histogram (cf. Fig. 3.18). The threshold for
separating the two peaks can for example be determined by assuming that
the shape of both peaks is Gaussian. In this case, a Gaussian is fit to each
peak and the threshold is then determined by locating the intersection of
both Gaussians.
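Once the two Gaussians have been fitted, their intersection can be computed in closed form. The sketch below assumes that each peak is modeled as a * exp(-(t - mu)^2 / (2 sigma^2)) with already known parameters (the fitting step is not shown) and that the peaks are well separated, so that an intersection exists between the two means. The function name and the parameter values are illustrative.

```python
import math

def gaussian_intersection(a1, mu1, sigma1, a2, mu2, sigma2):
    """Return the intensity between mu1 and mu2 where both Gaussian peaks are equal.

    Taking the logarithm of a1 * exp(...) = a2 * exp(...) turns the intersection
    condition into the quadratic equation qa * t^2 + qb * t + qc = 0.
    """
    qa = 1.0 / (2 * sigma2 ** 2) - 1.0 / (2 * sigma1 ** 2)
    qb = mu1 / sigma1 ** 2 - mu2 / sigma2 ** 2
    qc = mu2 ** 2 / (2 * sigma2 ** 2) - mu1 ** 2 / (2 * sigma1 ** 2) + math.log(a1 / a2)
    if abs(qa) < 1e-12:                      # equal widths: the equation is linear
        return -qc / qb
    root = math.sqrt(qb * qb - 4 * qa * qc)
    candidates = [(-qb + root) / (2 * qa), (-qb - root) / (2 * qa)]
    low, high = min(mu1, mu2), max(mu1, mu2)
    # Keep the intersection that separates the two peaks
    return next(t for t in candidates if low <= t <= high)

# Background peak around 50, broader and lower foreground peak around 150
theta = gaussian_intersection(1.0, 50.0, 10.0, 0.5, 150.0, 20.0)
```

For two peaks of equal height and width, the intersection is simply the midpoint between the two means; with unequal peaks, it moves toward the narrower one.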


Bimodal Histograms — Otsu’s Method 


Otsu’s method is based on the statistical analysis of the image’s histogram.
The method maximizes the inter-class variance $\sigma_B^2$ between the foreground
and background classes to find the optimal threshold value. To do this, the
normalized histogram is used. Let us assume that the image f contains L
different intensity values and is of size $w_f \times h_f$. Then the normalized
histogram h consists of the values

$$h(i) = \frac{n_i}{w_f \cdot h_f} \tag{3.21}$$

where $n_i$ is the count of intensity i in the image. For a threshold $\theta$, we can
now calculate the probability of a pixel being classified as background (Class
1) or foreground (Class 2)

$$P_1(\theta) = \sum_{i=0}^{\theta} h(i) \tag{3.22}$$

$$P_2(\theta) = 1 - P_1(\theta) = \sum_{i=\theta+1}^{L-1} h(i) \tag{3.23}$$

as a function of the threshold value. We can do the same for the mean values
of the intensities belonging to each class

$$\mu_1(\theta) = \frac{1}{P_1(\theta)} \sum_{i=0}^{\theta} (i + 1) \, h(i) \tag{3.24}$$

$$\mu_2(\theta) = \frac{1}{P_2(\theta)} \sum_{i=\theta+1}^{L-1} (i + 1) \, h(i). \tag{3.25}$$

Using these quantities, the inter-class variance can be expressed as

$$\sigma_B^2(\theta) = P_1(\theta) P_2(\theta) \left(\mu_1(\theta) - \mu_2(\theta)\right)^2. \tag{3.26}$$
Figure 3.18: Bimodal histogram thresholding. (a) Input image. (b) Histogram
of the input image (horizontal axis: intensity). The threshold value determined
with Otsu’s method is marked by a red line. (c) Thresholded image.


So maximizing $\sigma_B^2$ means that the distance between the mean values of
foreground and background, weighted by the two class probabilities, is
maximized. The optimal threshold $\theta^*$ is found by

$$\theta^* = \underset{\theta}{\arg\max} \; \sigma_B^2(\theta), \quad \theta \in [0, L). \tag{3.27}$$

In practice, the number of values $\theta$ can take is limited to L different values,
and in order to calculate the maximum of $\sigma_B^2$, it is sufficient to calculate its
values for all possible thresholds. In some cases, the maximum is not unique.
When this is the case, the threshold is determined by averaging over the
threshold values corresponding to the maximum values of $\sigma_B^2$.
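The exhaustive search over all thresholds can be sketched as follows. The code follows Eqs. (3.21) to (3.27), with Class 1 containing all intensities up to and including the threshold and with the (i + 1) weighting of the class means; it works on integer pixel counts to avoid rounding issues when a class is almost empty. The function name `otsu_threshold`, the tolerance used to detect ties, and the example images are assumptions for this illustration.

```python
def otsu_threshold(image, levels=256):
    """Return the threshold that maximizes the inter-class variance.

    Expects integer intensities in [0, levels). If several thresholds reach
    the maximum, their average is returned.
    """
    pixels = [value for row in image for value in row]
    n = len(pixels)
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1                      # n_i, the unnormalized histogram
    total = sum((i + 1) * counts[i] for i in range(levels))

    best_variance, best_thetas = -1.0, []
    n1 = 0    # number of pixels with intensity <= theta
    s1 = 0    # sum of (i + 1) * n_i for i <= theta
    for theta in range(levels):
        n1 += counts[theta]
        s1 += (theta + 1) * counts[theta]
        n2 = n - n1
        if n1 == 0 or n2 == 0:
            continue                            # one class would be empty
        p1, p2 = n1 / n, n2 / n
        mu1, mu2 = s1 / n1, (total - s1) / n2
        variance = p1 * p2 * (mu1 - mu2) ** 2
        if variance > best_variance + 1e-12:
            best_variance, best_thetas = variance, [theta]
        elif abs(variance - best_variance) <= 1e-12:
            best_thetas.append(theta)
    if not best_thetas:
        raise ValueError("image needs at least two different intensity values")
    return sum(best_thetas) / len(best_thetas)  # average if the maximum is not unique

two_level = [[10, 10, 200],
             [200, 10, 200]]
theta = otsu_threshold(two_level)
```

For the image `two_level`, every threshold from 10 to 199 separates the two intensities equally well, so the maximum is not unique and the averaged result lies halfway between these thresholds.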


Further Reading 


[1] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing 
(3rd Edition). Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2006. 
ISBN: 013168728X. 

[2] Richard Szeliski. Computer Vision: Algorithms and Applications. 1st. 
New York, NY, USA: Springer-Verlag New York, Inc., 2010. ISBN: 
9781848829343. 
This chapter points out the key aspects of minimally invasive surgery
with particular focus on abdominal surgery using endoscopes. The compari- 
son between minimally invasive and conventional open surgery is illustrated 
and several procedures are detailed. Moreover, this chapter introduces the 
term Image Guidance and its benefits. Finally, this leads to the motiva- 
tion why range imaging is of special interest for the next step of modern 
surgery. As endoscopes are more or less regular cameras that are inserted 
into the body, we also use this section to introduce the pinhole camera in 
Geek Box 4.1, elaborate on fundamental mathematical concepts of projec- 
tion in Geek Box 4.2, and introduce homogeneous coordinates and perspec- 
tive spaces in Geek Box 4.3. Fig. 4.1 displays an endoscopic image taken in 
the sigmoid colon which is the part of the large intestine that is closest to 
the rectum. 


4.1 Minimally Invasive Surgery and Open Surgery 


One criterion to categorize medical interventions is the degree of invasiveness.
Therefore, we distinguish between minimally invasive and open surgery,
which is illustrated in Fig. 4.2. As the name suggests, minimally invasive
procedures are medical procedures with little operative trauma. In comparison
to conventional open surgery, this leads to a shorter recovery time for
the patient and thereby to a reduced hospital stay. In recent years, a variety
of minimally invasive alternatives to conventional open surgery have evolved
with special focus on pathologies of the heart and the abdomen. In contrast to
open surgery, the physician has no direct access to the organs or structures
of the human body. On the one hand, this means fewer and smaller scars
and less pain for the patient, but on the other hand, without direct access
the physician has a limited sense of orientation and usually has to rely on
additional imaging techniques. For some medical procedures, e.g., the
removal of larger organs or transplantations, minimally invasive alternatives
are not available as the incision is just too small. For smaller organs such as the
kidney or gallbladder, laparoscopic interventions are already performed as a
common routine. In terms of operative time, minimally invasive surgery usually
takes longer due to the smaller incision and worse orientation. Both open
and in most cases minimally invasive surgery require anesthesia during the
intervention. Statistical comparison of open and minimally invasive surgery
in terms of quality of life shows an overall improved result for laparoscopic
interventions.

Figure 4.1: View into the colon using an endoscope. Here, the so-called
sigmoid colon is under investigation for abdominal pain. The inner walls of
the colon appear healthy without any signs of inflammation. Image data
courtesy of University Hospital Erlangen, Germany.


4.2 Minimally Invasive Abdominal Surgery 


As one of the most important fields of minimally invasive procedures, the 
diagnosis and treatment of abdominal pathologies is the main medical appli- 
cation of this chapter. Here, a variety of important special instruments are 
required: 
Geek Box 4.1: Pinhole Camera


The pinhole camera model is a simple yet powerful model that allows
us to describe various effects of the projection process. The model
assumes that the outside world is observed through a pinhole. As such,
the outside world is projected onto a screen on the opposite side of
the pinhole. Due to the pinhole, the image of the world is a scaled,
upside-down version of the outside world. In this way, the model allows
us to describe perspective, i.e., that the size of the projection
depends on the distance. In medical imaging and computer vision
[2], the pinhole camera model is a widely used assumption. Another
convention that is also commonly used is that the screen is virtually
placed outside the camera for graphical simplification.

[Figure: pinhole camera with virtual screen]


• Endoscopes: Depending on the procedure, these devices are non-rigid (flexible
  endoscopes) or rigid (laparoscopes) and serve as a camera inside the
  human body. Rigid endoscopes have the benefit that the navigation is much
  more intuitive, although the degrees of freedom during the navigation are
  reduced compared to non-rigid endoscopes.

• Trocars: To allow a fast exchange of different instruments, a trocar is placed
  in the human body as a port to the abdominal cavity. For different procedures,
  different sizes ranging from several millimeters up to a few centimeters
  are used.
Geek Box 4.2: Mathematical Projection Models


[Figure: orthographic, weak perspective, and perspective projection of a point onto a virtual screen]


In order to describe camera projections in math, we employ linear
algebra and matrix calculus. The figure above compares the projection
of a point $\mathbf{x}_{wc} = (x, y, z)^\top$ onto a virtual screen at coordinate
$\mathbf{i} = (i_x, i_y)^\top$ at focal length f (cf. Geek Box 4.1). For simplicity, we neglect
the y components in the figure.

The orthographic projection (dashed line) simply neglects the distance
to the screen and finds the projection as

$$\begin{pmatrix} i_x \\ i_y \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$

Note that this model is not able to describe scaling as it occurs in
projective modeling.

In order to alleviate this problem, the weak perspective model (dash-dotted
line) can be employed. Here we introduce a scalar value k

$$\begin{pmatrix} i_x \\ i_y \end{pmatrix} = \begin{pmatrix} k & 0 & 0 \\ 0 & k & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

that allows us to fix a global, depth-independent scaling.

At this point, we observe that we are not able to find a linear model
that is able to describe a full perspective projection model (dotted
line)

$$\begin{pmatrix} i_x \\ i_y \end{pmatrix} = \begin{pmatrix} f \, x / z \\ f \, y / z \end{pmatrix}$$

for any focal length f.
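The difference between the three models becomes apparent when two points that differ only in depth are projected. The following sketch assumes the usual pinhole relation in which the perspective projection scales x and y by f / z; the function names and the numeric values are illustrative.

```python
def orthographic(point):
    """Drop the depth: (x, y, z) -> (x, y)."""
    x, y, z = point
    return (x, y)

def weak_perspective(point, k):
    """Scale by a global, depth-independent factor k."""
    x, y, z = point
    return (k * x, k * y)

def perspective(point, f):
    """Pinhole projection onto a virtual screen at focal length f: scale by f / z."""
    x, y, z = point
    return (f * x / z, f * y / z)

near = (1.0, 2.0, 4.0)
far = (1.0, 2.0, 8.0)    # same x and y, twice as far away
f = 2.0

ortho_near, ortho_far = orthographic(near), orthographic(far)
persp_near, persp_far = perspective(near, f), perspective(far, f)
weak_near = weak_perspective(near, f / near[2])
```

The orthographic projections of `near` and `far` coincide, while the perspective projection of `far` is half as large as that of `near`. The weak perspective model reproduces the perspective result only for the single depth that was used to choose k.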
Figure 4.2: Illustrations of a cholecystectomy. The left image shows an open
surgery with direct access and the right image shows a minimally invasive
approach with endoscopic tools (Courtesy of Prof. Feußner, Technical
University Munich, Germany).


• Surgical instruments: For the actual procedure, different endoscopic tools
  are required, e. g., clamps or scissors. Those instruments have a scissor-like
  grip to control the action at the top of the tool.


The workflow of an endoscopic procedure in general is described by four 
steps. Usually, the patient has to be anesthetized in a first step. The actual 
procedure starts with small incisions where the trocars are inserted. Then, 
the abdomen is insufflated with carbon dioxide gas. This allows the physician 
to have more room for the procedure. Finally, the endoscope and the tools 
are inserted through the trocars to start the actual treatment. 


4.3 Assistance Systems 


Companies in the field of modern surgery have developed a variety of assistance
systems to ease the navigation or to reduce the required manpower for minimally
invasive procedures. Usually, several assistants besides the physician are
required during the intervention. As the surgeon performs the actual procedure
with endoscopic tools, one assistant has to hold the endoscope, one has to hand
over the required instruments, one has to supervise the patient, and often a
few more are involved for general organization. The remainder of this section
introduces some of the available assistance systems in detail.

A very basic and intuitive assistance system is an endoscope holder. These 
medical devices are available in different complexities, ranging from simple 
Geek Box 4.3: Homogeneous Coordinates


In Geek Box 4.2, we have seen that projective transforms cannot be expressed
by means of linear transforms in a Cartesian space. However,
there exists a class of mathematical spaces in which this is possible.
In order to do so, we virtually extend the dimension of our vector
$\mathbf{x} = (x, y)^\top$ by an additional component. This way, we create a new
homogeneous vector

$$\tilde{\mathbf{x}} = (\lambda x, \lambda y, \lambda)^\top$$

with $\lambda \neq 0$. Note that $\mathbf{x}$ can easily be lifted to homogeneous space by
selecting, e.g., $\lambda = 1$. For conversion back to a Cartesian space, one
simply selects the last component of a homogeneous vector and divides
all components by it. Then, the last component can be omitted and
the point can be mapped back to the original space. This projective
space has a number of additional properties, e. g., two points are only
equal (denoted by =) if they map back to the same Cartesian point,
and points with 0 in the last component cannot easily be mapped back
to the original space as they lie at infinity. A very good introduction
is given in [2].

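Using this powerful concept, we can now return to our previous problem
and observe

$$\begin{pmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} f x \\ f y \\ z \end{pmatrix},$$

which maps back to the Cartesian point $(f x / z, f y / z)^\top$, i.e., the perspective projection.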
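A short sketch shows how lifting, multiplying with a matrix, and mapping back work together. It assumes a 3 x 3 projection matrix with f, f, and 1 on the diagonal, applied to the 3-D point; the function names and numeric values are illustrative.

```python
def to_homogeneous(point, lam=1.0):
    """Lift a Cartesian point to homogeneous coordinates with scale lam != 0."""
    return tuple(lam * c for c in point) + (lam,)

def to_cartesian(vector):
    """Divide by the last component and drop it (undefined for points at infinity)."""
    *head, last = vector
    if last == 0:
        raise ValueError("point at infinity has no Cartesian counterpart")
    return tuple(c / last for c in head)

def mat_vec(matrix, vector):
    """Plain matrix-vector product."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)

f = 2.0
projection = [[f, 0.0, 0.0],
              [0.0, f, 0.0],
              [0.0, 0.0, 1.0]]   # assumed form of the projection matrix

world_point = (1.0, 2.0, 4.0)
homogeneous_image_point = mat_vec(projection, world_point)   # (f x, f y, z)
image_point = to_cartesian(homogeneous_image_point)          # (f x / z, f y / z)

round_trip = to_cartesian(to_homogeneous((3.0, 5.0), lam=2.0))
```

The nonlinear division by the depth z is thus deferred to the conversion back to Cartesian coordinates, while the projection itself is a plain matrix product.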
non-electronic static holders to automatic flexible holders that are navigated
with a joystick. In general, these endoscope holders allow a more stable image
acquisition as they exclude any jitter induced by a human. However, those
systems are often very basic and still have to be navigated by a surgeon.
An example is the SOLOASSIST, an electronic assistance arm that is navigated
by a joystick and simulates a human arm.

Besides endoscope holders, fully automatic robotic assistance systems are 
already commercially available. These systems allow the surgeon to be at a 
separate workstation as illustrated in Fig. 4.3. All commands are directly 
transmitted to the robot allowing the surgeon to even be at a distant place 
while performing the procedure. One of the most widespread systems is the da Vinci
system. It is navigated by grips and pedals that enable various degrees of
freedom. For intuitive visualization, this robot acquires stereoscopic images 
and thereby gives the surgeon a 3-D impression of the scene. 
Figure 4.3: The da Vinci assistance system (©2014 Intuitive Surgical, Inc.)
with the workstation on the left and the distant robotic system on the right. 


4.4 Range Imaging in Abdominal Surgery 


Although assistance systems for minimally invasive surgery are commercially 
available, one major problem of endoscopic interventions still remains. The 
orientation and thereby the navigation within the human body depends on 
the experience of the surgeon. Due to the lack of intuitive visual comparison 
to the environment, the narrow field of view induces a loss of depth and size 
estimation in the abdominal cavity. However, this information is required for 
diagnosis, e. g., the size of a polyp, and for decision making, e. g., choosing 
the most reasonable endoscopic instrument. Therefore, a variety of different 
approaches to compensate for this loss of information were investigated, e. g., 
adding grids to the endoscope lens or estimating sizes by comparison with 
known instruments. To achieve more accurate estimations, further approaches 
considering range images have been investigated. Although these systems can 
also be utilized to measure sizes or distances, range data acquiring devices 
also enable completely new projects to assist minimally invasive procedures, 
e. g., augmenting 3-D range images with preoperative 3-D CT data. Several
different techniques have been investigated to acquire endoscopic range images.
Besides several Shape-from-X approaches, e.g., Shape-from-Shading,
the three most popular acquisition techniques are stereo vision setups,
structured light, and time-of-flight (TOF). Today, only stereo endoscopes are
commercially available, but the two other techniques are under active
investigation by different researchers. This section will focus on the working
principle of available range imaging systems.
Figure 4.4: The illustration on the left-hand side describes a stereo vision
setup with two cameras Camera$_1$ and Camera$_2$ that observe a 3-D point $\mathbf{x}_{wc}$.
The projection of this point onto each image plane results in pixel indices
$\mathbf{i}_1$ and $\mathbf{i}_2$, which can be computed by a simple pinhole camera model. The
right-hand side image shows the front view of a stereo endoscope with two
apertures to acquire images from two different fields of view (close-up view
courtesy of Prof. Speidel, National Center for Tumor Diseases, Dresden).


4.4.1 Stereo Vision 


Stereo endoscopes are the most investigated range image acquiring setups in 
minimally invasive surgery. The principle is also implemented in the da Vinci 
assistance system. Stereo vision describes an intuitive acquisition technique 
that is similar to the human vision and depth estimation. 

The core concept behind stereo endoscopy is to estimate range information
by observing a scene from two different perspectives. Given a known baseline,
the framework has to detect the 2-D projections of a 3-D point in both image
planes. In theory, using basic trigonometry, the range information of these
points can then be calculated by triangulation, see Fig. 4.4. In practice, the
two viewing rays will probably not intersect, and the position of the 3-D point
is estimated by minimizing its distance to both rays.
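One simple way to realize this estimate is the midpoint of the shortest segment connecting the two viewing rays. The sketch below is one possible choice under stated assumptions and not necessarily the method used in a particular stereo endoscope: both rays are given by a camera center and a direction in a common coordinate system, and they are not parallel. The function names and values are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(origin1, direction1, origin2, direction2):
    """Midpoint of the shortest segment between two non-parallel viewing rays.

    Ray k is origin_k + t_k * direction_k. The parameters t_1 and t_2 follow from
    setting the derivatives of the squared distance between the lines to zero.
    """
    w = [p - q for p, q in zip(origin1, origin2)]
    a = dot(direction1, direction1)
    b = dot(direction1, direction2)
    c = dot(direction2, direction2)
    d = dot(direction1, w)
    e = dot(direction2, w)
    denom = a * c - b * b                 # zero only for parallel rays
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    closest1 = [p + t1 * v for p, v in zip(origin1, direction1)]
    closest2 = [p + t2 * v for p, v in zip(origin2, direction2)]
    return tuple((u + v) / 2 for u, v in zip(closest1, closest2))

# Two cameras with a baseline of 1 along x, both observing the same 3-D point
camera1, camera2 = (-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)
target = (0.2, 0.1, 5.0)
ray1 = tuple(t - c for t, c in zip(target, camera1))
ray2 = tuple(t - c for t, c in zip(target, camera2))
estimate = triangulate_midpoint(camera1, ray1, camera2, ray2)

# Two rays that miss each other by a distance of 1
skew = triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                            (0.0, 0.0, 1.0), (0.0, 1.0, 0.0))
```

With exact rays, `estimate` recovers the observed point. For the two skew rays, the result lies halfway between them.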

The requirements for stereo endoscopy are on the one hand a precisely calibrated
device and on the other hand diversified texture information of the
observed scene. Accuracy is increased with a wider baseline between both
sensors. As this baseline is limited by the diameter of the endoscope, the
improved accuracy has to be gained by calculating the corresponding points
in both images with higher precision. Corresponding points are calculated by
detecting features in both images, e. g., by applying the scale-invariant feature
transform (SIFT) or by computing speeded-up robust features (SURF).
Matching those feature points results in point pairs that correspond to the
same 3-D point in the observed scene. Therefore, the output of a stereo endoscope
highly depends on the quality of the two images and on the speed
and robustness of the feature detection and matching. The estimated range
images of a stereo setup, also called disparity maps, have a scene-dependent
image resolution.

The bottleneck of this range image acquisition technique is the calcu- 
lation of corresponding feature points. Hardware manufacturers tackle this 
problem by increasing image resolution in the sensor domain. This leads to 
more details even in almost homogeneous regions in the acquired images, but 
also induces more computational effort to calculate features on both images. 
Therefore, estimating accurate 3-D range data in real-time is a major issue 
in stereo endoscopy. Stereo endoscopes are currently the only commercially 
available and CE certified 3-D endoscopes. 


4.4.2 Structured Light 


Structured light endoscopy is a novel technique based on stereo vision con- 
cepts, but with artificially created feature points instead of those given by 
textural information. In minimally invasive surgery structured light systems 
are not yet commercially available. 

The working principle of structured light sensors is very similar to stereo 
vision systems. In comparison to those, structured light systems do not re- 
quire two cameras observing the scene. The second camera is replaced by 
a projector that generates a known pattern onto the observed scenario, see 
Fig. 4.5. The known baseline and corresponding points in the acquired im- 
age and the known projection pattern are used to reconstruct 3-D points by 
triangulation. As the projector generates the pattern that is used to calcu- 
late the feature points, structured light is also called an active triangulation 
technique. 

Similar to stereo vision, a known baseline between the sensor and the projector
and an accurate calibration are required for high-quality measurements. As
long as the pattern is clearly visible, this technique is independent of texture 
information of the observed scene. In comparison to stereo vision systems, 
the feature detection framework can be highly adapted to the projection pat- 
tern, as the structure of the feature points is known. The projection pattern 
should be easy to detect and hard to disturb by the texture of the scene. In 
conventional structured light setups, stripes or sinusoidal patterns are popu- 
lar. 

As the core concept for structured light is similar to stereo vision, so is 
its bottleneck of detecting and identifying the feature points of the projected 
pattern. Furthermore, the smaller the structures of the projected pattern are, 
the more 3-D points can be reconstructed, but the harder it is to identify those 
structures in the acquired images. 
Figure 4.5: The illustration on the left-hand side describes a structured light
setup with one camera and one projector that projects a known pattern
onto the scene. The projection of a scene point onto the image plane results in
a pixel index $\mathbf{i}$. Together with the projection pattern, the 3-D world point
$\mathbf{x}_{wc}$ can be reconstructed using triangulation. The right-hand side image
shows a projection on animal organs.


4.4.3 Time-of-Flight (TOF) 


TOF technology [4] tackles the topic of 3-D reconstruction from a completely
different perspective. Instead of using acquired color images to find any
distinctive structures, reflection characteristics are exploited to physically 
measure the distances of the observed scene. The first work to introduce this 
concept is found in [6]. 

The concept behind TOF technology is to measure a frequency modulated
light ray that is sent out by an illumination unit and received by a TOF sensor,
see Fig. 4.6. The received sinusoidal signal is sampled at four timestamps
to estimate the phase shift $\phi$ between the emitted and the received signal.
The radial distance d is then computed by

$$d = \frac{c}{2 f_{\text{mod}}} \cdot \frac{\phi}{2\pi}, \tag{4.1}$$

where c denotes the speed of light and $f_{\text{mod}}$ the modulation frequency.
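The two steps, phase estimation from four samples and conversion of the phase to a distance, can be sketched as follows. The sampling model is an assumption made for this illustration: the received signal is taken as offset + amplitude * cos(phi + k * pi / 2) for the four samples k = 0, 1, 2, 3. Real sensors may use a different convention, which changes the signs in the arctangent. The modulation frequency of 20 MHz is an illustrative value as well.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s

def phase_from_samples(a0, a1, a2, a3):
    """Phase shift from four samples taken a quarter period apart.

    Assumes samples of the form offset + amplitude * cos(phi + k * pi / 2);
    offset and amplitude cancel out in the differences.
    """
    return math.atan2(a3 - a1, a0 - a2) % (2 * math.pi)

def radial_distance(phi, f_mod):
    """Distance for a phase shift phi (radians) at modulation frequency f_mod (Hz)."""
    return SPEED_OF_LIGHT / (2 * f_mod) * phi / (2 * math.pi)

f_mod = 20e6                      # 20 MHz, an illustrative value
true_phi = 1.0
samples = [0.8 + 0.3 * math.cos(true_phi + k * math.pi / 2) for k in range(4)]
phi = phase_from_samples(*samples)
distance = radial_distance(phi, f_mod)
max_range = radial_distance(2 * math.pi, f_mod)
```

Since the phase is only known modulo $2\pi$, distances beyond `max_range`, which equals c / (2 f_mod), wrap around and cannot be distinguished from shorter ones. For 20 MHz this range is roughly 7.5 m.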

As the illumination unit can be realized by an light-emitting diode (LED) 
and the sensor is a simple CMOS or CCD chip, production costs of TOF 
sensors are rather low. However, due to their novelty compared to stereo vi- 
sion, current TOF devices exhibit low data quality and low image resolution. 
Besides the range image, most TOF devices provide additional data, e.g., 
photometric data, often denoted as the amplitude image and a binary va- 
lidity mask. Due to its measurement technique, TOF setups do not require 


d (4.1) 
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Figure 4.6: The left hand side illustration describes a TOF setup with a 
single camera and an illumination unit that sends out the modulated light. 
The reflected signal is then received by the TOF sensor at pixel index i. After 
calculating the phase shift, the radial distance is computed. The right hand 
side image shows a prototype. The setup includes two separated light sources 
for the color and the TOF acquisition. 


a baseline between the illumination unit and the measuring sensor, which is 
beneficial for the use in minimally invasive surgery. 
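
The distance computation of Eq. (4.1) can be illustrated with a short sketch. The text only states that the received signal is sampled at four timestamps; the concrete sampling model used below (samples taken a quarter period apart, with the phase recovered through an arctangent) is a common choice but an assumption here, as are all function names and the example values.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def phase_from_four_samples(s0, s1, s2, s3):
    """Estimate the phase shift (radians, in [0, 2*pi)) from four samples.

    Assumed sampling model (not taken from the text): the k-th sample is
    offset + amplitude * cos(k * pi / 2 - phase).
    """
    return math.atan2(s1 - s3, s0 - s2) % (2.0 * math.pi)


def tof_distance(phase, f_mod):
    """Radial distance according to Eq. (4.1): d = c / (2 f_mod) * phase / (2 pi)."""
    return C / (2.0 * f_mod) * phase / (2.0 * math.pi)


def simulate_samples(distance, f_mod, amplitude=1.0, offset=2.0):
    """Generate four ideal, noise-free samples for a target at `distance` metres."""
    phase = 4.0 * math.pi * f_mod * distance / C  # Eq. (4.1) solved for the phase
    return [offset + amplitude * math.cos(k * math.pi / 2.0 - phase) for k in range(4)]


# Hypothetical example: target at 1.5 m, modulation frequency 20 MHz.
samples = simulate_samples(distance=1.5, f_mod=20e6)
estimated = tof_distance(phase_from_four_samples(*samples), f_mod=20e6)
print(f"estimated distance: {estimated:.6f} m")
```

Because the phase is only known modulo $2\pi$, Eq. (4.1) implies that distances beyond $c / (2 f_{\mathrm{mod}})$ wrap around; in this sketch that would be roughly 7.5 m.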

Besides the low resolution, systematic errors severely reduce the data 
quality. The error sources range from color-dependent range measurements 
over temperature issues of the devices to flying pixels at object boundaries. 
In minimally invasive procedures, two major issues occur. First, multiple 
reflections within the abdominal cavity corrupt TOF measurements. Second, 
inhomogeneous illumination caused by the endoscopic optics hinders accurate 
range measurements. Still, the technology is real-time capable and thereby 
also suited for live broadcasting [9]. 

We perceive the physical world around us using our eyes, but only down to 
a certain limit. Objects with a diameter smaller than 75 µm cannot be 
recognized by the naked eye, and for this reason, they remained undiscovered 
for most of human history. Entities which belong to this category include 
cells (diameter of 10 µm), bacteria (1 µm), viruses (100 nm), molecules (2 
nm), and atoms (0.3 nm)¹. In fact, the importance of these micro/nano 
entities in almost every aspect of our life cannot be sufficiently appreciated. 
Microscopes are the tools which enable us to extend our vision to the micro- 
world and, despite the prefix micro- in the name, to the nano-world, too. This 
chapter takes the reader through the basic principles of the most widely-used 
light microscopy techniques, their advantages, and their inherent limitations. 
Further microscope types such as scanning tunneling microscopes or atomic 
force microscopes are beyond the focus of this text. In contrast to the 
previous chapter, a pinhole projection model is no longer sufficient to explain 
microscopy. Therefore, we introduce the thin lens model as it provides 
explanations for at least two functionalities: light-gathering and magnification. 

¹ The diameter measurements given here are for a blood cell, a typical bacterium, an 
influenza virus, a DNA molecule, and a uranium atom. 

Figure 5.1: Image formation in a converging lens for an object whose 
distance to the lens is larger than the focal length. 


5.1 Image Formation in a Thin Lens 


Consider an object with height h standing at a distance d in front of a con- 
verging lens with a focal length f < d. Naturally, the lens creates an image 
of this object. The question then arises as to how we can determine the height 
of the image h’ and its distance d' to the lens. From a geometrical optics 
perspective, the image formation process can be described using three simple 
rules (cf. Figure 5.1): 


1. An incident light ray which passes through the optical center O does not 
suffer any refraction. 

2. An incident light ray parallel to the optical axis is refracted such that it 
passes through the image focal point F'. 

3. An incident light ray which passes through the object focal point F is 
refracted parallel to the optical axis. 


As shown in Figure 5.1, the three rays intersect at a point positioned at 
distance d' from the lens. Obviously, two rays are sufficient to geometrically 
construct this intersection point. The image acquired at d' is defined as an 
in-focus image. On the other hand, an image acquired at a longer or a shorter 
distance than d' is called a defocused image. In this context, an image of a 
point source (such as T in Figure 5.1) is infinitely small at focus (abstracted 
as a point T' in Figure 5.1), but it is larger than a point for defocused images. 

Figure 5.2 shows the result of applying the rules of image formation, i.e., 
the three rules mentioned above, on the case when the object is within the 
focal length (d < f). As can be seen in the figure, the rays do not converge. 
However, the ray extensions intersect at a point T', called virtual image, 
from which the rays appear to diverge. In contrast, the images formed when 
d > f are called real as they are real convergence points of light rays. Virtual 
images formed by a converging lens are upright while the real images are 
upside-down. Another important difference is that virtual images cannot be 
projected on a screen, a camera chip, or any other surface. Nevertheless, they 
can be perceived by the human eye because the eye behaves as a converging 
lens which recollects the diverged light rays on the retina. 

Figure 5.3 shows the result of applying the rules of image formation in a 
diverging lens when d < f. It should be noted, however, that, contrary to 
the case of converging lenses, when applying these rules to diverging lenses, 
the image focus F' is at the side of incident light rays and the object focus 
F is at the other side of the lens. Similar to the case described in Figure 5.2, 
the image is upright and virtual. However, in contrast to Figure 5.2, it is 
demagnified. We obtain this result with a diverging lens when d > f as well. 

So far, we could geometrically construct the image of an object in a di- 
verging or a converging lens. At this point, we may ask whether there are 
closed-form equations which relate the object height h to the image height 
h’, or the object-lens distance d to the image-lens distance d’. 

Let us consider a converging lens with d > f (cf. Figure 5.1). From the 
similar triangles TOB and T'OB', one can directly write: 

$$\frac{h'}{h} = \frac{d'}{d}. \qquad (5.1)$$

The same applies for triangles TFB and FOL: 

$$\frac{h'}{h} = \frac{f}{d - f}. \qquad (5.2)$$

Combining Eq. (5.1) and Eq. (5.2) yields: 

$$\frac{f}{d - f} = \frac{d'}{d}$$
$$fd = d'd - d'f$$
$$fd + d'f = d'd$$

Dividing by fdd' yields the thin lens equation: 

$$\frac{1}{d'} + \frac{1}{d} = \frac{1}{f}. \qquad (5.3)$$

Figure 5.2: Image formation in a converging lens for an object whose 
distance to the lens is smaller than the focal length. 

Figure 5.3: Image formation in a diverging lens. 


Eq. (5.3) was derived in this text for real images in a converging lens. Never- 
theless, it can be also used for virtual images and/or diverging lenses under 
the following sign conventions: 1) d' is negative when the image is at the ob- 
ject side of the lens (similar to the case in Figure 5.2), otherwise it is positive. 
2) f is negative for diverging lenses. Moreover, if we add a third sign con- 
vention stating that h’ is positive for upright images and negative otherwise, 
then Eq. (5.1) and Eq. (5.2) can be generalized to the following form: 


$$M = \frac{h'}{h} = -\frac{f}{d - f} = -\frac{d'}{d}. \qquad (5.4)$$


Based on the above-mentioned sign conventions, the magnification M is posi- 
tive for upright images and negative for upside-down images. This generaliza- 
tion, i. e., Eq. (5.3) and Eq. (5.4), can be proved to be correct by applying the 
three rules of geometric image formation and employing triangle similarity for 
each specific setup. Moreover, based on Eq. (5.4), the following conclusions 
can be drawn: 


• The image of an object in a converging lens is magnified (|M| > 1) when 
d < 2f, has the same size as the object when d = 2f, and is demagnified 
(|M| < 1) when d > 2f. 

• The image of an object in a diverging lens (f < 0) is demagnified. 
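
The thin lens equation and the magnification under the sign conventions above can be explored with a small sketch. The function name and the example distances are purely illustrative assumptions; the units are arbitrary as long as d and f share them.

```python
def thin_lens_image(d, f):
    """Return (d_prime, M) for object distance d > 0 and focal length f.

    Sign conventions as in the text: f < 0 for diverging lenses, d_prime < 0
    for images on the object side (virtual), M < 0 for upside-down images.
    """
    if d == f:
        raise ValueError("object in the focal plane: the image forms at infinity")
    d_prime = f * d / (d - f)     # thin lens equation, Eq. (5.3), solved for d'
    magnification = -d_prime / d  # Eq. (5.4)
    return d_prime, magnification


# Hypothetical example values covering the cases discussed in the text.
for label, d, f in [
    ("converging, d > 2f", 30.0, 10.0),
    ("converging, d = 2f", 20.0, 10.0),
    ("converging, f < d < 2f", 15.0, 10.0),
    ("converging, d < f", 5.0, 10.0),
    ("diverging", 5.0, -10.0),
]:
    d_prime, m = thin_lens_image(d, f)
    kind = "real" if d_prime > 0 else "virtual"
    print(f"{label}: d' = {d_prime:.2f}, M = {m:.2f} ({kind})")
```

The sign of M distinguishes upright from upside-down images, and its absolute value reproduces the two conclusions listed above.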


5.2 Compound Microscope 


If you look through a magnifying glass at an object located within the focal 
length of the lens, you see a magnified upright virtual image of the object. 
Conceptually, this is a simple microscope. The compound microscope (cf. 
Figure 5.4) extends this basic principle by using at least two converging 
lenses. The lens which is closer to the specimen is called the objective lens. It 
creates a real, magnified, inverted image G_o of the specimen. This requires 
that the specimen distance to the objective d_o is in the range f_o < d_o < 2f_o, 
where f_o is the focal length of the objective. The second lens is called the eyepiece 
as it is the component through which a user of the microscope observes the 
sample. The distance of G_o to the eyepiece d_e is, by construction, less than 
the focal length of the eyepiece (d_e < f_e). Consequently, the eyepiece lens 
creates a magnified virtual image G_e of G_o. Since the image of the first lens 
is an object for the second one, the total magnification is the product of the 
two lens magnifications. 
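
As a rough illustration of the product rule, the following sketch chains two thin lenses. It assumes ideal thin lenses and uses made-up focal lengths and distances; the helper names are hypothetical and not part of any microscope specification.

```python
def thin_lens_image(d, f):
    """Image distance and magnification of a thin lens (Eq. (5.3) and Eq. (5.4))."""
    d_prime = f * d / (d - f)
    return d_prime, -d_prime / d


def compound_magnification(f_o, d_o, f_e, d_e):
    """Return (M_objective, M_eyepiece, M_total) for a two-lens microscope.

    f_o, d_o: focal length of the objective and specimen distance.
    f_e, d_e: focal length of the eyepiece and distance of the intermediate
              image to the eyepiece.
    """
    if not (f_o < d_o < 2.0 * f_o):
        raise ValueError("expected f_o < d_o < 2 f_o (real, magnified intermediate image)")
    if not (0.0 < d_e < f_e):
        raise ValueError("expected d_e < f_e (virtual, magnified final image)")
    _, m_o = thin_lens_image(d_o, f_o)
    _, m_e = thin_lens_image(d_e, f_e)
    return m_o, m_e, m_o * m_e


# Hypothetical example values in millimetres.
m_o, m_e, m_total = compound_magnification(f_o=10.0, d_o=12.0, f_e=25.0, d_e=20.0)
print(f"objective: {m_o:.1f}, eyepiece: {m_e:.1f}, total: {m_total:.1f}")
```

The negative total magnification in this example reflects that the objective inverts the image while the eyepiece keeps the orientation of its object.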

Figure 5.4: Image formation in a compound microscope. Symbols F_o, F_o', F_e, 
and F_e' represent the objective object focal point, objective image focal point, 
eyepiece object focal point, and eyepiece image focal point, respectively. A 
human observer at the right-hand side of the figure will see the image G_e. 

Figure 5.5: The numerical aperture is determined by θ, the half angle of the 
maximum light cone, and n, the refractive index of the medium between lens 
and specimen. 

Figure 5.6: Basic diagrams of a bright field microscope and a fluorescence 
microscope. Both were drawn after [1]. (a) Scheme of a bright field microscope. 
(b) Scheme of a fluorescence microscope. 

In modern microscopes, the objective lens is characterized by its 
magnification and numerical aperture. The magnification was defined above in 
Eq. (5.4). The numerical aperture quantifies the capability of a lens to gather 
light. It is defined as follows: 

$$\mathrm{NA} = n \sin\theta, \qquad (5.5)$$

where n is the refractive index of the medium between objective lens and 
specimen ($n_{\mathrm{air}} \approx 1$) and $\theta$ is the half angle of the maximum light cone which 
the lens can collect (cf. Figure 5.5). Since the image formed by the objective 
lens is real, it can be captured by a physical detector. For instance, it can be 
recorded by a CCD chip, and hence, the magnified view can be saved as a 
digital image which can be further processed by a digital computer. 
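
Eq. (5.5) can be evaluated directly. The sketch below uses hypothetical helper names and computes the numerical aperture from a half angle as well as the half angle that corresponds to a given numerical aperture.

```python
import math


def numerical_aperture(n, half_angle_deg):
    """NA = n * sin(theta), Eq. (5.5), with theta given in degrees."""
    return n * math.sin(math.radians(half_angle_deg))


def half_angle_deg(na, n=1.0):
    """Half angle (degrees) of the light cone collected by a lens with given NA."""
    if not 0.0 <= na <= n:
        raise ValueError("NA cannot exceed the refractive index n")
    return math.degrees(math.asin(na / n))


# Example: an objective with NA = 0.3 used in air (n approximately 1).
print(f"half angle for NA = 0.3 in air: {half_angle_deg(0.3):.2f} degrees")
print(f"NA for a 60 degree half angle in air: {numerical_aperture(1.0, 60.0):.3f}")
```

Since the sine cannot exceed one, the numerical aperture is bounded by the refractive index of the medium in front of the lens.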

The principle of the compound microscope models the magnification 
mechanism. Additionally, depending on how the sample is illuminated and which 
kind of information is carried by light rays, light microscopes can be further 
classified into subcategories: bright field, fluorescence, phase contrast, quanti- 
tative phase, and others. In the following sections, more details will be given 
about each of the aforementioned microscopic modalities. 


5.3 Bright Field Microscopy 


Typically, the density and thickness of a specimen are space-variant. 
Consequently, specimen points absorb light differently, i.e., the energy of light 
after passing through the specimen is, likewise, space-variant. Figure 5.6(a) 
schematically shows how this fact can be utilized in a microscopic setup. The 
condenser shown in the figure concentrates the light coming 
from a light source onto the specimen. The specimen information is encoded in 
the intensity of the light wave which reaches the objective. The background, i.e., 
the part of the scene which does not contain dense objects, tends to be bright in 
the resulting image. This observer impression gave the technique its name. 
A bright field setup is the first choice whenever minimizing cost 
or implementation effort is the main concern. An example of a 
bright field image of cells is shown in Figure 5.7. 

Figure 5.7: A microscopic image of a cell culture: The image was acquired 
using a Nikon Eclipse TE2000U microscope with a bright field objective of 
magnification 10x and NA = 0.3. 

Figure 5.8: Illustration of cell viability detection using PI-staining. 
(a) A bright field image of Chinese hamster ovary (CHO) cells. 
(b) The same scene as in (a) but seen under a fluorescent channel. Red 
spots indicate dead cells. 

In clinical routine, cells in suspension are only investigated infrequently. 
Instead, the most common investigation techniques for bright field microscopy 
are cytology, where cells and their inner structure are investigated, and 
histology, where the embedding of cells into the surrounding tissue architecture 
is described. For both techniques, staining of the sample plays an important 
role (see Geek Box 5.1). 

Geek Box 5.1: Stains for Histology and Cytology 


To highlight cellular structures, sections from tissue biopsies and also 
cytology slides are often dyed or stained. The most common form of 
stain in histology is a mixture of two substances called hematoxylin 
and eosin, where the hematoxylin stains cell nuclei blue, while 
cytoplasm and other cellular structures are dyed magenta by eosin. 
Dyes are furthermore used to assess the amount of certain substances, 
e.g., copper or iron, or biological structures adhering to certain 
biomarkers. Besides a main color, often a secondary (or even third) color with 
a strongly different spectral shape is used to dye other cellular 
compartments and enhance the contrast, a process called counterstaining. 

In order to prepare a sample, it usually undergoes the process of 
fixation with formaldehyde and embedding in paraffin wax. The fixation 
stops a great part of the biological processes and ensures a proper 
quality of the slide and a slow degradation process. Embedding in a block 
of wax is a precondition for cutting thin slices of constant thickness, 
which are then placed on a microscope slide and covered with a 
coverslip. 


Different stains for histology and cytology. 
Top row: Hematoxylin-eosin, Azan, Multi-cytokeratin (AE1, AE3). 
Bottom row: Grocott, May-Grünwald-Giemsa, Turnbull Blue. 
Images courtesy of FU Berlin, Germany. 


5.4 Fluorescence Microscopy 


While a bright field microscope utilizes light absorption of a sample, a fluo- 
rescence microscope makes use of another natural phenomenon called, unsur- 
prisingly, fluorescence. Some special materials, when illuminated with light 
having a specific wavelength, emit light with another wavelength. As shown 
in Figure 5.6(b), an excitation filter is required to select a part of the 
electromagnetic spectrum for exciting the fluorescent materials in the specimen. 
Another filter is then utilized to separate the emitted light from that used in 
the excitation process. 

Fluorescence microscopes deliver images of high contrast when compared 
to bright field images. In addition, due to the fact that fluorescence can be 
incited by specific biological or physical processes, scientists were able to find 
many applications of fluorescence microscopy in materials science and cellular 
biology. To give just one example, a widely-used technique for cell viability 
detection (cf. Figure 5.8) is based on imaging of a fluorescent dye called 
propidium iodide (PI). Viable cells are usually selectively permeable, i. e., 
they do not allow molecules to freely cross the cellular membrane. When a 
cell dies, this exclusion property is lost, allowing PI to leak through the cellular 
membrane toward the cell interior. PI then binds to RNA and DNA inside the 
penetrated cell, which drastically enhances the fluorescence. Therefore, dead 
cells can be easily distinguished from the non-stained viable cells. 

There are at least two shortcomings of fluorescence imaging: Firstly, stain- 
ing may cause some undesired effects on the sample under study. For instance, 
it was shown that the dyes used in cell viability detection affect cell stiffness. 
Secondly, what we see under fluorescence microscopy is the activity of fluores- 
cent dyes which, in general, does not reveal structural information. Moreover, 
these fluorescent dyes do not always cover the entire imaged object. These 
two factors lead to incomplete shape information. Fluorescent dyes are also 
employed in confocal laser endomicroscopy, yet in a different setup which is 
discussed in Geek Box 5.2. 


5.5 Phase Contrast Microscopy 


As mentioned earlier, in bright field microscopy, light absorption is respon- 
sible for image formation. Objects which absorb light are called amplitude 
objects since they affect light amplitude. Transparent objects, on the other 
hand, hardly alter the amplitude of light. They do, however, delay the light wave, 
introducing a phase shift, and thus, they are given the name phase objects. We 
demonstrate this effect visually in Figure 5.10 and introduce the underlying 
math in Geek Box 5.3. 

Geek Box 5.2: Confocal Laser Endomicroscopy 


Recently, a novel method of fluorescence microscopy imaging has 
gained attention in research: In Confocal Laser Endomicroscopy 
(CLE), a fiber bundle carrying laser light in the cyan color spectrum 
is inserted into cavities of the human body, usually through the acces- 
sory channel of a normal endoscope. With high magnification ratios, 
it is being used for structural tissue analysis in vivo, i.e. in the living 
patient. Due to the confocal construction, a single focal plane at a 
defined depth can be visualized as a sharp image since the image is 
not tainted by scattered light. Prior to the examination, a fluorescent 
contrast agent is given to the patient intravenously. It accumulates in 
the intercellular space and thus makes it possible to outline cellular 
structures. 


Confocal Laser Endomicroscopy (adapted from [15]). Labeled components: 
x scanner, y scanner, dichroic filter, photodetector, fiber bundle. 


CLE generates video sequences at rates of up to 12 Hz [15] and is 
clinically used for diagnosis within the gastro-intestinal tract [13]. However, 
its application is not limited to this area: In the field of neurosurgery, it was 
shown that a discrimination of brain tumors can be performed on CLE 
images [9], and it was also successfully used for diagnosis of tumors in 
the mouth and the upper airways [21, 10]. 


CLE image of healthy epithelial tissue of the vocal folds (left) and 
with squamous cell carcinoma (right). Images courtesy of University 
Hospital Erlangen, Germany. 

(a) A bright field image dominated by amplitude objects: CHO cells in 
suspension. 
(b) A bright field image dominated by phase objects: adherent ultra-thin CHO 
cells. 
(c) A phase contrast image of the same scene shown in Figure 5.9(b). In 
comparison to Figure 5.9(b), cells are clearly visible, albeit surrounded by 
halo artifacts. 

Figure 5.9: Examples of amplitude objects and phase objects in biology. 


Typical light detectors such as CCD chips or the retina in our eyes can 
recognize amplitude variations but they are insensitive to phase distortion. In the 
1930s, the Dutch physicist Frits Zernike came up with a brilliant trick for con- 
verting the invisible phase shift to a visible amplitude change using an optical 
filter. His contribution is the basis for a long-established technique in labora- 
tories today known as phase contrast. Figure 5.9(a) shows a bright field image 
of a sample dominated by amplitude objects. In this particular example, they 
are cells in suspension. Figure 5.9(b) also shows a bright field image, but of 
a sample dominated by phase objects. The sample contains ultra-thin adher- 
ent cells. In Figure 5.9(c), the same specimen of Figure 5.9(b) is shown, but 
under a phase contrast microscope. A considerable improvement in contrast 
and information content can be clearly seen in the phase contrast image. 


5.6 Quantitative Phase Microscopy 


In the previous section, phase was employed to obtain more contrast of trans- 
parent specimens. At this point, we may ask the following question: what does 
the numerical phase value tell us about the physical properties of a specimen? 
As discussed in Geek Box 5.4, we only observe the difference of the phases of 
two waves and are unable to observe an absolute value. 

Phase contrast (cf. Section 5.5) is convenient for qualitative unstained 
imaging of transparent specimens. However, it is not suitable for obtain- 
ing quantitative phase values for two reasons: Firstly, phase information is 
perturbed by artifacts, called phase halos, in image regions which surround 
phase objects (cf. Figure 5.9(c)). Secondly, Zernike’s approach which links an 
observed intensity value to the corresponding phase value is valid only for 
very small phase shifts. 

Geek Box 5.3: Wave Equation 


Informally speaking, at a point in space $\mathbf{r} = (x, y, z)$, we can imagine 
the light activity as a particle dancing in time according to $e^{j\omega t}$, where 
t is time and $\omega = 2\pi\nu$ is the angular frequency (with frequency $\nu$) 
which determines light color. In general, this dance is amplitude-scaled and 
phase-shifted differently at each point in space. Consequently, the wave/particle 
function $\psi(\mathbf{r}, t)$ can be modeled as follows: 

$$\psi(\mathbf{r}, t) = A(\mathbf{r})\, e^{j(\omega t + \phi(\mathbf{r}))} = A(\mathbf{r})\, e^{j\phi(\mathbf{r})}\, e^{j\omega t} = U(\mathbf{r})\, e^{j\omega t}. \qquad (5.6)$$

The term $U(\mathbf{r})$ encodes both amplitude change $A(\mathbf{r})$ and phase shift 
$\phi(\mathbf{r})$ as a complex number, and is thus called complex amplitude of 
the wave. Eq. (5.6) is not sufficient to describe a wave unless $\psi$ fulfills 
the celebrated wave equation: 

$$\frac{\partial^2 \psi}{\partial t^2} = c^2\, \nabla^2 \psi, \qquad (5.7)$$

where c is the speed of light in the propagation medium, and 
$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ 
is the spatial Laplacian. Assuming that $\psi$ can be 
factorized as $\psi(\mathbf{r}, t) = \psi_r(\mathbf{r})\, \psi_t(t)$ (which is the case in Eq. (5.6)), 
one can derive the time-independent wave equation, also known as 
Helmholtz's equation: 

$$\nabla^2 U(\mathbf{r}) + k^2 U(\mathbf{r}) = 0, \qquad (5.8)$$

where k is defined as $k = \omega / c$ and called wavenumber. An important 
class of solutions for Helmholtz's equation is given by the following 
complex amplitude: 

$$U_1(\mathbf{r}) = A_1\, e^{j \mathbf{k}^T \mathbf{r}}. \qquad (5.9)$$

In this solution, the amplitude is constant everywhere with a 
real value $A_1$, whereas the phase is linearly dependent on position, 
$\phi_1 = \mathbf{k}^T \mathbf{r} = x k_x + y k_y + z k_z$. In order for Eq. (5.9) to satisfy 
Helmholtz's equation, $\mathbf{k}$ must fulfill $\sqrt{k_x^2 + k_y^2 + k_z^2} = k$. This fact can 
be verified by setting $U(\mathbf{r}) = U_1(\mathbf{r})$ in Eq. (5.8). Moreover, the locus 
of points in space for which $U_1(\mathbf{r})$ = constant is a plane with 
normal vector $\mathbf{k}$. Therefore, waves described by Eq. (5.9) are called plane 
waves. 

Figure 5.10: If we consider light as a wave with amplitude A and wavelength 
λ, we observe that amplitude objects reduce the wave amplitude by absorption. 
Phase objects cause a phase shift due to differences in the refractive 
index inside and outside the object. Panels: a) Wave, b) Amplitude Object, 
c) Phase Object. 
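
The statement that a plane wave satisfies Helmholtz's equation (Eq. (5.8)) only if the length of its wave vector equals the wavenumber k can be checked numerically. The following sketch is merely an illustration: it approximates the Laplacian with central finite differences, and the wave vector, evaluation point, and step size are arbitrary assumptions.

```python
import cmath
import math


def plane_wave(r, k_vec, amplitude=1.0):
    """Complex amplitude of a plane wave, amplitude * exp(j * k_vec . r)."""
    phase = sum(k_i * r_i for k_i, r_i in zip(k_vec, r))
    return amplitude * cmath.exp(1j * phase)


def laplacian(func, r, step):
    """Second-order central finite-difference Laplacian of func at point r."""
    center = func(r)
    total = 0j
    for axis in range(3):
        forward = list(r)
        backward = list(r)
        forward[axis] += step
        backward[axis] -= step
        total += (func(forward) - 2.0 * center + func(backward)) / step**2
    return total


def helmholtz_residual(r, k_vec, k, step=1e-3):
    """Left-hand side of Helmholtz's equation for the plane wave with vector k_vec."""
    wave = lambda point: plane_wave(point, k_vec)
    return laplacian(wave, r, step) + k**2 * wave(r)


k_vec = (1.0, 2.0, 2.0)                      # hypothetical wave vector
k = math.sqrt(sum(k_i**2 for k_i in k_vec))  # its length, the required wavenumber
point = (0.3, -0.2, 0.5)                     # arbitrary evaluation point

print(abs(helmholtz_residual(point, k_vec, k)))        # close to zero
print(abs(helmholtz_residual(point, k_vec, k + 0.5)))  # clearly non-zero
```

The small remaining residual in the first case stems from the finite-difference approximation, not from the plane wave itself.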


Quantitative phase microscopy is an umbrella term for a set of techniques 
by which it is possible to obtain reliable quantitative phase information. Geek 
Box 5.5 discusses one of the methods to determine quantitative phase in de- 
tail: the transport of intensity equation (TIE). Due to the quantitative nature 
of TIE results, it can be utilized to compute specimen physical descriptors 
which are difficult to obtain using phase contrast. For instance, it can in 
principle be used for estimating cell thickness and volume in biological cell 
cultures. In general, the TIE seems to be attractive when compared to phase 
contrast for at least two reasons: 1) It is possible to obtain high-contrast 
phase images using a bright field microscope which is cheap and easy to 
implement compared to a phase contrast microscope. 2) TIE yields quantita- 
tive rather than qualitative phase information. However, every new technique 
comes with its own problems, and TIE is by no means an exception to this 
rule. In fact, estimating the axial derivative is very sensitive to the selection 
of the defocus distance Δz. In addition, a TIE solution is prone to be perturbed 
by a low-frequency bias field which needs to be corrected. 

Geek Box 5.4: Phase Shift 


In fact, the phase shift introduced by a phase object can be given as 
follows: 

$$\phi_{\mathrm{diff}}(x, y) = k \int_{z_1(x, y)}^{z_2(x, y)} \Delta n(x, y, z)\, dz, \qquad (5.10)$$

where k is the wavenumber of the incident light, $\Delta n(x, y, z)$ is the 
difference in the refractive index between the object and surrounding 
medium, and $z_1$ and $z_2$ are the start and end coordinates of the light path 
through the object. If the object has a homogeneous refractive index, 
Eq. (5.10) reduces to: 

$$\phi_{\mathrm{diff}}(x, y) = k \cdot \Delta n \cdot q(x, y), \qquad (5.11)$$

where $q(x, y)$ is the object thickness at (x, y). The product of refractive 
index with the geometric length of light path is usually termed 
optical path length. In addition, the difference of two optical path 
lengths is called optical path difference. Therefore, the numerical value 
of phase is interpreted as optical path difference between the object 
and the surrounding medium. The constant k is typically ignored. 
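
For a homogeneous object, the relation between phase shift and thickness can be inverted, which is what makes a quantitative phase value physically useful. The sketch below assumes the common relation k = 2π/λ between wavenumber and wavelength; the function names, the refractive index difference, and the thickness are hypothetical example values.

```python
import math


def wavenumber(wavelength):
    """Wavenumber k = 2 * pi / wavelength (assumed relation, same length unit)."""
    return 2.0 * math.pi / wavelength


def phase_shift(thickness, delta_n, wavelength):
    """Phase shift of a homogeneous object: k * delta_n * thickness."""
    return wavenumber(wavelength) * delta_n * thickness


def thickness_from_phase(phase, delta_n, wavelength):
    """Invert the relation above: thickness = phase / (k * delta_n)."""
    return phase / (wavenumber(wavelength) * delta_n)


# Hypothetical example: a 5 um thick object, refractive index difference 0.03,
# illuminated at 550 nm (all lengths in metres).
phi = phase_shift(thickness=5e-6, delta_n=0.03, wavelength=550e-9)
q = thickness_from_phase(phi, delta_n=0.03, wavelength=550e-9)
print(f"phase shift: {phi:.4f} rad, recovered thickness: {q * 1e6:.2f} um")
```

Note that such an inversion requires the refractive index difference to be known, which is an additional assumption in practice.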


5.7 Limitation of Light Microscopy 


In Figure 5.1, a point source creates a point image at focus. This is, however, 
a result of geometrical optics which does not take the wave nature of light 
into account. From a wave-optics perspective, light exhibits the properties of 
waves, and hence, it undergoes diffraction upon encountering a barrier or a 
slit. In microscopy, this slit is the finite-sized aperture of the objective. Due to 
the diffraction process, the image of a point source is a pattern known, after 
Sir George Airy, as Airy pattern. As shown in Figure 5.12(a), it is composed of 
a central spot, known as Airy disk (in 2-D), surrounded by multiple diffraction 
rings. The radius of the Airy disk, when the image is in its best focus, is: 

$$d_{\mathrm{airy}} = 0.61\, \frac{\lambda}{\mathrm{NA}}, \qquad (5.14)$$

where $\lambda = c/\nu$ is the wavelength of incident light. It is noteworthy 
that $d_{\mathrm{airy}}$ in Eq. (5.14) is given in object-space units. Therefore, in the image 
plane, the radius of the Airy disk is $M \cdot d_{\mathrm{airy}}$, where M is the magnification. 

The resolving power of a microscopic system is defined as the minimum 
distance between two point sources in the object space for which they are still 
discernible as two points in the image plane. Intuitively, the two points are 
distinguishable as long as the sum of the two corresponding Airy patterns 
contains two distinct peaks. However, the condition under which the two peaks 
are considered distinct can be defined in several ways. This led to different, 
but similar, definitions of the resolving power. According to Rayleigh, it is 
given by the radius of the Airy disk, $d_{\min} = d_{\mathrm{airy}}$ (cf. Figure 5.12(b)). A slightly 
different definition, known as the Abbe criterion, is given as $d_{\min} = 0.5\, \lambda / \mathrm{NA}$. 

Geek Box 5.5: Transport of Intensity Equation (TIE) 


Teague derived the TIE in 1983 starting from Helmholtz's equation 
(cf. Eq. (5.8)) under the approximation of a slowly varying field along 
the z-axis: 

$$-k\, \frac{\partial I(x, y)}{\partial z} = I(x, y)\, \nabla_\perp^2 \phi(x, y) + \nabla_\perp I(x, y) \cdot \nabla_\perp \phi(x, y), \qquad (5.12)$$

where $I(x, y)$ is the at-focus intensity image (related to the complex 
amplitude in Eq. (5.8) by $I = |U|^2$), and $\nabla_\perp$ is the gradient operator 
in the lateral directions, i.e., in the xy plane. The symbol $\phi$ denotes 
the phase difference (cf. Eq. (5.10)), but $\phi$ was used instead of $\phi_{\mathrm{diff}}$ as 
the phase appears only in differential terms in the TIE. In other words, 
the phase in the TIE is defined up to an additive constant, which makes no 
difference between $\phi$ and $\phi_{\mathrm{diff}}$. This equation can be further simplified 
if we assume ideal phase objects, i.e., $I(x, y) = \mathrm{constant} = I_0$, to the 
following form: 

$$-\frac{k}{I_0}\, \frac{\partial I(x, y)}{\partial z} = \nabla_\perp^2 \phi(x, y). \qquad (5.13)$$

The axial derivative at the left-hand side of Eq. (5.12) or Eq. (5.13) can 
be measured: First, acquire a bright field image at focus $I_0$. Defocus 
the microscope by a distance Δz and acquire another image $I(\Delta z)$. 
(Illustration: images $I(-\Delta z)$, $I_0$, and $I(\Delta z)$ along the optical axis.) 
The finite-difference approximation of the derivative is then given 
by $\frac{I(\Delta z) - I_0}{\Delta z}$. After estimating the axial derivative, the only unknown 
which is left in the TIE is the phase. Therefore, the TIE can be solved 
for $\phi$ yielding a quantitative phase map. 

Earlier in this text, it was mentioned that ideal phase objects are 
invisible in bright field microscopy. In fact, as demonstrated in Figure 5.11, 
the aforementioned statement is correct only under the condition that 
the image is acquired at focus. This phenomenon, i.e., the possibility 
to visualize phase objects in bright field microscopy, can be interpreted 
in the light of the TIE. The contrast obtained by defocusing is 
numerically represented by the left-hand side of Eq. (5.13). The right-hand 
side reveals that this contrast is, in fact, phase information. The 
employment of defocusing to visualize transparent samples in a bright 
field setup is sometimes called defocusing microscopy. 

Figure 5.11: Illustration of quantitative phase microscopy using the TIE. 
The figures show a cell culture of adherent ultra-thin L929 cells. 
(a) A defocused bright field image of the cell culture: Δz = −15 µm. 
(b) A bright field image of the cell culture at-focus: Δz = 0. 
(c) A defocused bright field image of the cell culture: Δz = +15 µm. 
(d) A quantitative phase map obtained by solving the TIE. The bias 
field was partially corrected using a bias-correction algorithm. 
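
The finite-difference estimate of the axial intensity derivative described in Geek Box 5.5 can be sketched as follows. Images are represented as nested lists to keep the example self-contained. The central-difference variant, which uses two defocused images such as those in Figure 5.11(a) and (c), is an assumption added for comparison and is not stated in the text; all names and values are illustrative.

```python
def axial_derivative_forward(i_focus, i_plus, dz):
    """Forward difference (I(dz) - I_0) / dz, evaluated pixel by pixel."""
    return [
        [(p - f) / dz for f, p in zip(row_focus, row_plus)]
        for row_focus, row_plus in zip(i_focus, i_plus)
    ]


def axial_derivative_central(i_minus, i_plus, dz):
    """Central difference (I(dz) - I(-dz)) / (2 dz), an assumed alternative."""
    return [
        [(p - m) / (2.0 * dz) for m, p in zip(row_minus, row_plus)]
        for row_minus, row_plus in zip(i_minus, i_plus)
    ]


# Synthetic 2x2 example: the intensity changes linearly with z at 0.5 per um.
dz = 15.0  # defocus distance in um
i_focus = [[1.0, 2.0], [3.0, 4.0]]
i_plus = [[value + 0.5 * dz for value in row] for row in i_focus]
i_minus = [[value - 0.5 * dz for value in row] for row in i_focus]

print(axial_derivative_forward(i_focus, i_plus, dz))
print(axial_derivative_central(i_minus, i_plus, dz))
```

For real images, the two estimates generally differ, and both depend on the chosen defocus distance.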

In order to enhance microscopic resolution, one needs to employ light of 
shorter wavelength and/or an objective of higher numerical aperture. Using 
shorter wavelengths will be considered in the next section. The numerical 
aperture, as revealed by Eq. (5.5), is theoretically upper-limited by unity 
when air ($n_{\mathrm{air}} \approx 1$) is the medium between the specimen and the objective. In 
order to go beyond this limit, microscope manufacturers designed objectives 
which can function when a medium of higher refractive index such as water 
($n_{\mathrm{water}} \approx 1.33$) or oil ($n_{\mathrm{oil}} \approx 1.51$) is placed between the specimen and 
the objective. This led to the development of water immersion objectives and 
oil immersion objectives. 

Figure 5.12: Diffraction barrier: Due to diffraction, the image of a point 
source is an Airy pattern. The resolving power $d_{\min}$ of a microscope is thus 
limited by the width of this pattern. Both panels plot intensity over space. 
(a) Airy pattern composed of an Airy peak with radius $d_{\mathrm{airy}}$ surrounded by 
diffraction rings. 
(b) Rayleigh criterion: Two features with distance less than $d_{\min} = d_{\mathrm{airy}}$ will 
be resolved as a single feature. 

If we set the wavelength in Eq. (5.14) to the wavelength at the center of the 
visible spectrum, $\lambda_{\mathrm{visible}} \approx 550$ nm, and the numerical aperture to the theoretical 
upper bound of oil-immersion numerical apertures, $\mathrm{NA}_{\max} = 1.51$, we obtain 
a Rayleigh resolution of $d_{\min} = 222$ nm ≈ 0.2 µm. This value² is often cited 
as the resolution limit of optical microscopy. Two distinct points in object 
space with a distance less than 0.2 µm will be imaged as a sum of two Airy 
patterns in which only one distinct peak can be recognized. Increasing the 
magnification will increase the size of this sum of Airy patterns at the image 
plane, but the enlarged image remains a single-peak pattern. In other words, 
beyond a certain limit, increasing the magnification does not resolve new 
details. This phenomenon is known as empty magnification. 
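
The resolution figures quoted above follow directly from the Rayleigh and Abbe criteria. The short sketch below evaluates both for the theoretical upper bounds of the numerical aperture in air, water, and oil; the function names are illustrative, and the refractive indices are the approximate values given in the text.

```python
def rayleigh_resolution(wavelength, na):
    """Rayleigh criterion: d_min = 0.61 * wavelength / NA."""
    return 0.61 * wavelength / na


def abbe_resolution(wavelength, na):
    """Abbe criterion: d_min = 0.5 * wavelength / NA."""
    return 0.5 * wavelength / na


WAVELENGTH_NM = 550.0  # center of the visible spectrum

# Theoretical upper bound of NA equals the refractive index of the medium.
for medium, na_max in [("air", 1.0), ("water", 1.33), ("oil", 1.51)]:
    rayleigh = rayleigh_resolution(WAVELENGTH_NM, na_max)
    abbe = abbe_resolution(WAVELENGTH_NM, na_max)
    print(f"{medium}: Rayleigh {rayleigh:.0f} nm, Abbe {abbe:.0f} nm")
```

The magnification does not appear in either formula, which is the quantitative core of the empty magnification argument.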


² Or other close approximations of it depending on the considered upper limit of 
numerical aperture and definition of resolving power. 


5.8 Beyond Light Microscopy 


One obvious way of increasing microscopic resolution is using a wavelength 
which is shorter than the wavelength of visible light. For instance, it is possible 
to employ ultraviolet (UV) radiation (wavelength in range 300 to 100 nm), soft 
X-ray (10 to 1 nm), hard X-ray (below 1 nm)³, or electron beams (wavelengths 
below 5 pm are achievable). Each wavelength range allows us to explore a 
part of the nano-world, but also poses new challenges for both 
microscope manufacturers and users. At the UV wavelengths, glass strongly 
absorbs light radiation, and thus, in UV microscopy, the lenses are made of 
UV-transparent materials such as quartz. Moreover, at the wavelengths of 
X-ray radiation, the refractive index of solid substances is very close to the 
refractive index of air. Since the light-focusing performed by a visible-light 
lens is inherently a refraction process, these lenses cannot be used to focus 
X-ray beams. In fact, in X-ray microscopy, expensive and impractical devices 
which are based on diffraction instead of refraction are employed to replace 
the typical optical lenses. Electron microscopy utilizes electromagnetic lenses 
and cathode rays in order to achieve a drastic improvement in resolution 
compared to light microscopy. Unlike ultraviolet and X-ray radiation, cathode 
rays, being beams of electrons with measurable mass and negative charge, are not 
electromagnetic radiation. Therefore, the wave-particle duality of photons, 
and hence the concept of wavelength, is not directly applicable. One of 
the major contributions which led to the development of electron microscopy 
is the theory of Louis de Broglie, who stated in his PhD thesis that the 
wave-particle duality is also valid for matter. According to de Broglie, the 
wavelength of an electron of mass $m_e$ and speed $c_e$ is given by: 


$$\lambda_e = \frac{p}{m_e c_e} \tag{5.15}$$


where $p$ is the Planck constant. As an alternative to refraction in optical lenses, 
electromagnetic lenses exploit the deflection of electron beams by magnetic fields 
to focus the beams. In an electron microscope, similar to a cathode-ray tube, 
an electron beam is emitted into vacuum by heating the cathode 
and then accelerated by applying a voltage between the cathode and the an- 
ode. The speed of the electrons, and hence the wavelength (cf. Eq. (5.15)), 
can be controlled by varying the voltage. Schematically, the first electron 
microscopes were very similar to bright field microscopes. The 
acquired image is based on the absorption of electrons by the specimen as they are 
transmitted through the sample; hence, they were named transmission 
electron microscopes. Transmission electron microscopes achieve a resolution as 
fine as 0.2 nm. A major limitation of this scheme, however, 
is that only very thin samples can be imaged. Scanning electron microscopy 
was developed to cope with this difficulty. To do so, a primary electron beam 
is focused by an electromagnetic lens on a very small part of the specimen. 
This primary beam incites the emission of a secondary electron beam. The 
intensity of this secondary beam is recorded. Afterwards, the primary beam 
is moved to another part of the specimen, and the same process is applied. 
This is repeated so that the entire specimen is scanned in a raster pattern, and 
the final image is obtained from the recorded values of the secondary beam 
intensities. Scanning electron microscopy can be used to image thick samples, 
even though it captures only the surface details. In addition, the secondary 
beam is accompanied by X-ray emission characteristic of the material which 
emitted it. Therefore, it is employed to reveal the chemical composition of 
specimens. Both scanning and transmission electron microscopes 
work in a vacuum. Consequently, they can be used only for dead 
specimens. From this perspective, X-ray and traditional light microscopy are 
preferred over electron microscopy. Although X-ray and electron microscopes 
provide a considerable improvement of resolution over light microscopes, they 
are extremely expensive, require large hardware, and mostly involve complicated 
sample preparation. 


³ X-ray and UV radiation, being a part of the electromagnetic spectrum, belong to 
invisible light. The term light microscopy is, however, restricted to visible light in this 
text. 
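To make the link between accelerating voltage and electron wavelength in Eq. (5.15) concrete, the following sketch estimates the wavelength for a given voltage. It rests on assumptions that are not part of the text: the speed is obtained from energy conservation, $e U = m_e c_e^2 / 2$, and a second function adds the relativistic momentum, which matters at typical microscope voltages. The function names and the example voltage of 100 kV are hypothetical.

```python
import math

PLANCK = 6.62607015e-34              # Planck constant in J*s (p in Eq. (5.15))
ELECTRON_MASS = 9.1093837015e-31     # electron rest mass m_e in kg
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge in C
LIGHT_SPEED = 299_792_458.0          # speed of light in m/s


def wavelength_nonrelativistic(voltage: float) -> float:
    """Eq. (5.15) with the speed taken from e*U = m_e * c_e**2 / 2."""
    speed = math.sqrt(2.0 * ELEMENTARY_CHARGE * voltage / ELECTRON_MASS)
    return PLANCK / (ELECTRON_MASS * speed)


def wavelength_relativistic(voltage: float) -> float:
    """Planck constant divided by the relativistic momentum of the electron."""
    energy = ELEMENTARY_CHARGE * voltage
    rest_energy = ELECTRON_MASS * LIGHT_SPEED**2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy)))
    return PLANCK / momentum


voltage = 100e3  # assumed accelerating voltage in volts
print(f"non-relativistic: {wavelength_nonrelativistic(voltage) * 1e12:.2f} pm")
print(f"relativistic:     {wavelength_relativistic(voltage) * 1e12:.2f} pm")
```

Both estimates lie below 5 pm, in line with the wavelength range stated for electron beams, and raising the voltage shortens the wavelength further.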


5.9 Light Microscopy Beyond the Diffraction Limit 


In the past few years, so-called superresolution microscopy has become an 
active research trend. Today, based on this technology, there are microscopes 
which achieve a resolving power of about 10 nm. While this number is inferior 
to electron microscopy resolution, the breakthrough lies in the fact that this 
is achieved using visible light. As stated earlier in this text (cf. Section 5.7), 
the attainable resolution using visible light is limited to 200 nm. May we then 
conclude that the theory which led to the diffraction limit in light microscopy 
is flawed? It is not: superresolution microscopy is based on alternately turning 
fluorescent molecules in a specimen on and off. Two adjacent fluorescent 
molecules with a distance less than 200 nm will not be resolved as two points 
in a superresolution microscope when both of them are turned on simultane- 
ously. However, this will be the case, i.e., they will be resolved as two points, 
if only one of them is activated at a specific time, and in addition, there is 
a mechanism to control this activation process. Superresolution microscopy 
techniques differ in the way in which this on/off switching is implemented. 
Major technologies in this field today include: stimulated emission depletion 
(STED), reversible saturable optical fluorescence transitions (RESOLFT), 
and stochastic optical reconstruction microscopy (STORM). 
Modern MRI systems allow physicians to look inside the body without 
the use of ionizing radiation (see Fig. 6.1). They provide excellent soft-tissue 
contrast for morphological imaging as well as a range of possibilities for func- 
tional imaging, e.g., for visualizing blood flow, tissue perfusion or diffusion 
processes. In the following chapter, we will outline the physical fundamentals 
of magnetic resonance (MR), concepts for imaging, common pulse sequences 
to produce different contrasts as well as some advanced topics related to 
speeding up the acquisition and to functional imaging. 


6.1 Nuclear Magnetic Resonance (NMR) 


6.1.1 Genesis of the Resonance Effect 


To explain the MR effect, we first look at the example of a compass needle 
which is subjected to a magnetic field. The needle progressively aligns itself 
along the direction of the magnetic field by oscillating around it, as shown 
in Fig. 6.2. The amplitude of the oscillation decreases over time. But the 
frequency of the oscillation is determined by the strength of the magnetic 
field and the properties of the needle and remains fixed over time. 

© The Author(s) 2018 
A. Maier et al. (Eds.): Medical Imaging Systems, LNCS 11111, pp. 91-118, 2018. 
https://doi.org/10.1007/978-3-319-96520-8_6 

92 6 Magnetic Resonance Imaging 

Figure 6.1: A modern MRI scanner can provide both morphological and 
functional imaging. Image courtesy of Siemens Healthineers AG. 

Now recall that a radio frequency (RF) wave corresponds to a magnetic 
field that varies over time. So during its oscillation, our magnetic needle can 
be seen as an antenna that emits RF waves at the frequency of its oscillation. 
These emissions stop once the needle has reached a stable position, but by 
pushing it out of balance, we can cause new RF waves to be emitted. This 
“push” can also be achieved by means of a magnetic field, one that is applied 
perpendicularly to the original magnetic field which the needle is aligned 
with. A Java applet¹ can simulate this process. Broadly speaking, this is the 
same principle that is applied in MRI to generate images. 

The “magnetic needles” in our body commonly used for MRI are hydrogen 
($^1$H) nuclei. They have an intrinsic property known as spin, visualized as the 
rotation of a sphere around an axis in Fig. 6.3, which makes them act like 
small magnets. The endpoints of the axis of rotation can be thought of as the 
poles of the magnetic needle. In the absence of an external magnetic field, the 
axes of hydrogen nuclei within the body are randomly distributed, so the sum 
of all magnetic fields is zero. Subjected to a large magnetic field, denoted by 
$B_0$, the spin axes tend to align with the direction of the magnetic field, 
similar to a compass needle. In contrast to a compass, this alignment is 
only partial, due to random interactions between nuclei (compare Fig. 6.4). 
Even so, the sum of all spin directions no longer cancels out and will instead 
point in the direction of the magnetic field. In what follows, we will call this 
sum of all spin directions the net (total) magnetization vector M. 

¹ http://drcnmr.dk/JavaCompass/ 

Figure 6.2: Behavior of a compass needle in a magnetic field, shown in panels 
(a) to (d). The needle oscillates through the “north” position until it reaches 
a stable position. In real compasses, the magnetic needle is immersed in a 
fluid to dampen such oscillations. 

Figure 6.3: Hydrogen nuclei are used for MRI because of their magnetic 
susceptibility and their vast amount in the human body. An intrinsic property 
of the hydrogen nuclei is their rotation (spin) which makes them magnetic 
along the rotational axis. 

Thus, the nuclei inside the body will accumulate to a net magnetization 
in the presence of a strong magnetic field $B_0$ as the partially aligned spin 
axes sum up to M. Applying, for instance, a field strength of 1 T (tesla) 
to a million nuclei yields a net magnetization with the magnetic strength of 
about 3 nuclei, which means that one million partially aligned nuclei make 
up for only 3 completely aligned nuclei. The induced magnetization M is 
proportional to the applied field strength of $B_0$, and as there are over $10^{27}$ 
hydrogen nuclei in the whole body, the net magnetization accumulates to a 
measurable magnitude. 
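The figure of about 3 fully aligned nuclei per million at 1 T can be reproduced with a rough estimate. The sketch below assumes a simple two-level Boltzmann model at body temperature, which is not introduced in the text, and uses the hydrogen value of 42.576 MHz/T given in this chapter for the gyromagnetic ratio. The function name is hypothetical.

```python
import math

PLANCK = 6.62607015e-34    # Planck constant in J*s
BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K
GAMMA_HYDROGEN = 42.576e6  # gyromagnetic ratio of hydrogen in Hz/T


def excess_aligned_per_million(field_tesla: float, temperature_kelvin: float = 310.15) -> float:
    """Net number of aligned nuclei per million, assuming a two-level Boltzmann model."""
    energy_gap = PLANCK * GAMMA_HYDROGEN * field_tesla
    polarization = math.tanh(energy_gap / (2.0 * BOLTZMANN * temperature_kelvin))
    return 1e6 * polarization


for field in (1.0, 1.5, 3.0):
    print(f"{field:.1f} T: about {excess_aligned_per_million(field):.1f} per million")
```

In this regime the result grows linearly with the field strength, which matches the statement that the induced magnetization is proportional to the applied field.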

In our compass example, we said that the magnetic needle oscillates until 
it reaches a stable position, and emits RF waves due to this oscillation. The 
oscillation of the needle is a 2-dimensional motion happening in the plane 
of the compass. For spin axes, a similar process happens, but the motion is 
3-dimensional and is called precession. From its initial position, the spin axis 
rotates around the axis of the strong magnetic field $B_0$. At the same time, 
the angle between the spin axis and $B_0$ decreases over time, until they are 
aligned. The same motion can be observed in reverse with a spinning top, 
with the spin axis corresponding to the tilt of the spinning top and the axis 
of $B_0$ corresponding to the direction of gravity². As with the oscillation of 
the compass needle, the precession causes the emission of RF waves. 

Figure 6.4: Nuclei axes within the body will point randomly (a) until a 
strong magnetic field forces their rotation axes to partially align with the 
applied magnetic field (b). The accumulated magnetization of all spins M 
precesses around $B_0$ (c). Panels: (a) random spins, (b) partially aligned 
spins, (c) precession. 

We said for the magnetic needle that the frequency of the oscillation is 
determined by the strength of the magnetic field and the properties of the 
needle and that it remains fixed over time. The same holds for nuclear spin 
precession, and the frequency of the precession is 

$$f_\ell = \gamma \cdot \|B_0\|, \tag{6.1}$$

also known as the Larmor frequency. The gyromagnetic ratio $\gamma$ is a 
nucleus-specific constant that relates the precession frequency to the field 
strength; it is 42.576 MHz/T for hydrogen. Using, for instance, a 1.5 T field 
strength, protons will resonate with a frequency of about 64 MHz. Please note 
that precession should not be confused with spin, which is the rotation of a 
single nucleus around its own axis. 
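As a quick numerical check of Eq. (6.1), the following sketch evaluates the Larmor frequency of hydrogen for a few field strengths. The function name and the list of field strengths are chosen for illustration only.

```python
GAMMA_HYDROGEN = 42.576e6  # gyromagnetic ratio of hydrogen in Hz/T, as given in the text


def larmor_frequency_hz(field_tesla: float) -> float:
    """Eq. (6.1): precession frequency for a main field of the given magnitude."""
    return GAMMA_HYDROGEN * abs(field_tesla)


for field in (1.0, 1.5, 3.0):
    print(f"{field:.1f} T -> {larmor_frequency_hz(field) / 1e6:.3f} MHz")
```

For 1.5 T this gives roughly 63.9 MHz, i.e., the value of about 64 MHz quoted above.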

We will now abstract away from the behavior of individual spins and only 
consider the net magnetization M. Analogously to our compass needle, we 
will push M out of its stable position, a process which we call excitation, to 
cause the emission of RF waves from the body. The direction of M can be 
modified through a weaker magnetic field $B_1$ in a direction orthogonal to $B_0$ 
by applying RF waves from a coil at the resonance frequency of M. 

In the case of M, this is not as straightforward as it was for the compass 
needle. In the 2-D plane of the compass, there is only one choice for the direction 
of the second magnetic field, which is orthogonal to the main magnetic 
field. M precesses in a 3-D motion around $B_0$, so the second magnetic field 
$B_1$ must be orthogonal to $B_0$ as well as aligned with the current rotation 
angle of M. But we can simplify things if we change our point of view. Imagine 
the precession of M by picturing that the vector M is “attached” to a 
turntable which is rotating around $B_0$. The motion of M as it is being pushed 
out of balance and precessing at the same time seems complicated, as does 
the direction of $B_1$ that needs to be applied, see Fig. 6.5(a). Now if we step 
onto the turntable and look at the motion again, it is much simpler. In this 
rotating frame of reference, the direction of $B_1$ is constant, and we cannot 
see the precession motion, only the excitation caused by $B_1$, see Fig. 6.5(b). 

² MIT Physics Demo: https://www.youtube.com/watch?reload=9&v=8H98BgRzp0M 

Figure 6.5: Excitation of the magnetization vector M viewed from the 
world (a) and rotating frame of reference (b). The precession is illustrated as 
a turntable, denoted by the red arrow, on which M is mounted. Viewed from 
the outside, the combination of the precession and excitation motions looks 
quite complex and the $B_1$ field rotates in tune to the precession. From the 
rotating frame of reference, only the simple path of the excitation motion is 
visible and the $B_1$ field is static. Figure recreated from [9]. 

Once the secondary magnetic field $B_1$ is turned off, the magnetization 
vector M will slowly return to the equilibrium position by a process called 
relaxation, which is described in the next subsection. During this process, 
RF waves are emitted from the body and can be received with coils placed 
near the body surface. This is the signal from which MR images are then 
generated, as outlined in Sec. 6.2. 


6.1.2 Relaxation and Contrasts 


This section aims at explaining the concept of relaxation, which is the origin 
of contrast between different tissues in the resulting images. Relaxation is the 
process that causes the net magnetization to steadily return to equilibrium, 
i.e., the resting state, after excitation by an RF pulse. In addition to the 
dependency on the field strength, the speed of relaxation is tissue dependent, 
as proton interactions are limited for large molecules or dense tissue where 
water molecule movement is hindered. This means that different tissues, e.g., 
water and fat, will complete relaxation at different points in time and, thus, the 
amount of signal received during relaxation will differ. This is what gives 
rise to the contrast between tissues in MR images. To differentiate between 
different tissues, we will no longer look at the net magnetization vector M 
of all excited spins but instead overlay our imaging volume with a voxel grid 
and look at magnetization vectors per voxel. 


6.1.2.1 Relaxation 


Let's first recap the situation before relaxation starts: The per-voxel magnetization 
vectors are initially aligned with the main magnetic field $B_0$, which 
we will now assume to be aligned with the $z$ axis of our coordinate system. 
As such, each magnetization vector can be split into a longitudinal component 
$M_z$ and a transversal component $M_{xy}$, such that $M = M_{xy} + M_z$. 
So initially, $\|M_{xy}\| = 0$ and $\|M_z\|$ is a positive number dependent on the 
number of hydrogen nuclei contained within the voxel. During excitation, an 
RF pulse tips the magnetization vector into the transversal plane, such that 
$\|M_z\| = 0$ and $\|M_{xy}\|$ is maximal. We call such an RF wave a 90° pulse 
because it changes the angle of the magnetization vector by 90°. 

Once the 90° pulse ends, relaxation occurs in the form of two independent 
processes to get back to the equilibrium state. The magnetization vector 
recovers its longitudinal component, i.e., $\|M_z\|$ tends towards its original 
magnitude. And, usually much faster, the magnetization vector loses its 
transversal component, i.e., $\|M_{xy}\|$ tends towards 0. These independent 
processes happen on different time scales, meaning that the magnitude of the 
magnetization vector is not constant over time. Fig. 6.6 visualizes the trajectory 
of a magnetization vector during relaxation. The physical reasons for 
both relaxation processes are outlined in the following paragraphs. 


Recovery of longitudinal magnetization 


is achieved by a process called spin-lattice relaxation, whereby the nuclear 
spins release the energy received from an RF pulse back into the surrounding 
lattice (tissue), leading towards thermal equilibrium or resting state. The 
recovery of the longitudinal magnetization follows an exponential function 
$\|M_z(t)\| = \|M_0\| (1 - e^{-t/T_1})$, which is characterized by a time constant $T_1$ 
that is different for each tissue class. The time constant $T_1$ is defined as the 
time period for $M_z$ to recover $1 - 1/e \approx 63\%$ of its initial magnetization $M_0$. 
This characteristic number of an exponential function serves to determine the 
point in time when the process is considered “finished”. The common wisdom 
is that after $5T_1$, which corresponds to a 99.3% recovery, the process is as 
good as done. As an example, white matter has a $T_1$ of 510 ms while it is 
2500 ms for arterial blood. The magnetization recovery for arterial blood is 
plotted in Fig. 6.7(a). 


Figure 6.6: Visualization of the precession and relaxation of white matter 
with $T_1 = 510$ ms and $T_2 = 67$ ms in the “world” (a) and rotating frame of 
reference (b). After 90° RF excitation, the magnetization is in the transverse 
plane (red vector, M before relaxation), which is the starting point for the 
relaxation process. The magnetization then follows the blue trajectory until 
the resting state (green vector, M after relaxation) is reached. Figure 
recreated and amended from [9]. 


Decay of transversal magnetization (in theory) 


Assuming a perfectly homogeneous magnetic field, the decay of transversal 
magnetization is caused by random interactions between nuclei. 
More explicitly, interactions between the magnetic fields of nuclei lead to 
temporary phase differences. Ultimately, the nuclei move out of phase and 
the overall signal that can be measured along $B_1$ decreases, i.e., the 
transversal magnetization $M_{xy}$ decays. Again, the magnetization follows an exponential 
function, but this time the exponential time constant $T_2$ determines the decay 
$\|M_{xy}(t)\| = \|M_0\| e^{-t/T_2}$. $T_2$ is defined as the time after excitation when 
the signal value has decreased to $1/e \approx 37\%$ of its initial value $M_0$. It is 
dependent on the tissue density or rather the chemical structure, and, thereby, 
also characteristic for every tissue. The transversal magnetization decay of 
arterial blood is plotted in Fig. 6.7(b). 
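The two exponential laws and the rule of thumb of five time constants can be verified numerically. The sketch below uses the relaxation times of arterial blood stated in this chapter ($T_1 = 2.5$ s, $T_2 = 45$ ms); the function names are hypothetical and the magnetization is normalized to $\|M_0\| = 1$.

```python
import math


def longitudinal_recovery(t: float, t1: float, m0: float = 1.0) -> float:
    """Magnitude of the longitudinal magnetization at time t after a 90 degree pulse."""
    return m0 * (1.0 - math.exp(-t / t1))


def transverse_decay(t: float, t2: float, m0: float = 1.0) -> float:
    """Magnitude of the transversal magnetization at time t after a 90 degree pulse."""
    return m0 * math.exp(-t / t2)


T1_BLOOD = 2.5    # seconds
T2_BLOOD = 0.045  # seconds

for factor in (1, 5):
    recovered = longitudinal_recovery(factor * T1_BLOOD, T1_BLOOD)
    remaining = transverse_decay(factor * T2_BLOOD, T2_BLOOD)
    print(f"after {factor} time constant(s): "
          f"{100 * recovered:.1f}% recovered, {100 * remaining:.1f}% remaining")
```

After one time constant, about 63% of the longitudinal magnetization has recovered and about 37% of the transversal magnetization remains; after five time constants, both processes are practically complete.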


Figure 6.7: Plots of the recovery of longitudinal magnetization (a) and the 
decay of transverse magnetization (b) for arterial blood with $T_1 = 2.5$ s and 
$T_2 = 45$ ms. Panel (a) shows $\|M_z(t)\| = \|M_0\| (1 - e^{-t/T_1})$ over time $t$ with 
marks at $T_1 = 2.5$ s and $5T_1 = 12.5$ s; panel (b) shows 
$\|M_{xy}(t)\| = \|M_0\| e^{-t/T_2}$ with marks at $T_2 = 45$ ms and $5T_2 = 225$ ms. 
As a rule of thumb, the process is considered completed after five 
times the respective time constant, with over 99% of the longitudinal 
magnetization restored after $5T_1$ and less than 1% of transverse magnetization 
remaining after $5T_2$. 


Decay of transversal magnetization (in practice) 


The actual decay happens more quickly, i.e., the received signal in an MRI 
acquisition decays faster than predicted by $T_2$. This is due to imperfections 
in the homogeneity of the main magnetic field $B_0$, which are related to various 
effects like magnetic susceptibilities and magnet manufacture. $T_2^*$ (“T two 
star”) refers to the relaxation which includes the ideal tissue-dependent 
relaxation due to random interactions between nuclei ($T_2$) plus the additional loss 
of signal due to field imperfections. Note that $T_2^*$ is also affected by 
tissue-dependent magnetic susceptibilities and is always shorter than $T_2$. However, 
there exists a measurement method, called spin echo, that can recover the 
signal lost through field-dependent dephasing of nuclei via refocusing pulses 
(see Sec. 6.3.1). 

Fig. 6.7 compares the exponential course of $T_1$ recovery and $T_2$ decay, 
produced by the signal of arterial blood. Now, we can also fully explain the 
3-D visualization of relaxation in Fig. 6.6 which is described by the blue 
trajectory: the net magnetization vector $M_0$ (green) has been completely 
tipped into the xy-plane (red vector). As relaxation starts, imagine the length 
of the red vector ($M_{xy}$) to decrease with an exponential decay defined by $T_2$, 
while at the same time the length of the green vector ($M_z$) grows with a 
speed defined by $T_1$. On top of that, the magnetization vector rotates around the 
main magnetic field $B_0$ (here, z-direction) with the field strength dependent 
Larmor frequency. 
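The trajectory described above can be sketched by combining the two relaxation laws with a rotation around the z axis. This is a minimal illustration under assumptions that go beyond the text: an ideal 90° pulse, a normalized magnetization, a clockwise sense of rotation, and a precession frequency that is artificially slowed down to 50 Hz so that individual turns remain visible. The function names are hypothetical.

```python
import math


def magnetization(t, t1, t2, precession_hz, m0=1.0):
    """Magnetization (x, y, z) in the world frame at time t after an ideal 90 degree pulse."""
    m_xy = m0 * math.exp(-t / t2)         # decaying transversal part
    m_z = m0 * (1.0 - math.exp(-t / t1))  # recovering longitudinal part
    phase = 2.0 * math.pi * precession_hz * t
    return (m_xy * math.cos(phase), -m_xy * math.sin(phase), m_z)


def magnitude(vector):
    return math.sqrt(sum(component**2 for component in vector))


# White matter values from Fig. 6.6, in seconds.
T1_WHITE_MATTER, T2_WHITE_MATTER = 0.510, 0.067

for t in (0.0, 0.05, 0.1, 0.5, 2.5):
    m = magnetization(t, T1_WHITE_MATTER, T2_WHITE_MATTER, precession_hz=50.0)
    print(f"t = {t:4.2f} s: M = ({m[0]:+.3f}, {m[1]:+.3f}, {m[2]:+.3f}), |M| = {magnitude(m):.3f}")
```

The printed magnitudes first drop well below 1 and then grow back, which illustrates that the length of the magnetization vector is not constant during relaxation.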
6.1.2.2 Contrasts 


The appearance of MR images is controlled by a large number of parameters including the 
type of sequence and numerous sequence parameters per acquisition. The 
choice of the contrast or weighting for a particular image is fundamental as 
it determines the subsequent medical application and thereby its diagnostic 
significance. Three major contrasts are distinguished: $T_1$ weighting, 
$T_2$ weighting and proton density (PD) weighting. Weighting is used in 
the sense that the acquisition parameters are chosen such that the image contrast 
mainly reflects variations due to one of these tissue-inherent properties, 
for instance, spin-spin relaxation ($T_2$). 


The parameters that control the particular weighting of a spin echo 
sequence are the echo time (TE) and the repetition time (TR). TE is the 
time delay after an emitted RF pulse until the RF signal is measured. In the 
meantime, transversal magnetization decay and signal loss will occur due 
to $T_2$ relaxation, which means that the TE determines the $T_2$ weighting of 
images. For example, a long TE compared to the $T_2$ of the tissue being captured 
yields a strong $T_2$ contrast but only little signal as the signal decay has 
progressed for a long time. 


TR is the period of time between successive RF pulses. Several similar 
measurements are needed, for instance, to encode multiple lines per image or 
for advanced imaging protocols. Each successive RF pulse will flip parts of 
the available longitudinal magnetization into the transversal plane. During 
the following relaxation, the longitudinal magnetization builds up again with 
a speed determined by $T_1$. If the time between successive measurements is 
short (short TR), the available magnetization is used often and cannot recover 
to equilibrium, yielding a relatively small signal per repetition. A long 
TR, in contrast, will produce a stronger signal as most of the longitudinal 
magnetization will have recovered by then. However, the contrast in $T_1$ vanishes 
entirely if the longitudinal magnetization has fully recovered before the 
measurement is taken, i.e., if TR is significantly longer than the $T_1$ of the tissues. 
Thus, the $T_1$ weighting of an image is controlled by TR: a long TR produces 
a strong signal but only limited $T_1$ weighting, whereas a short TR amplifies the 
variations between tissues with different $T_1$, albeit with a generally weak signal. 

The third type of contrast, proton density weighting, is chosen to minimize 
both $T_1$ and $T_2$ variations. What is then left are variations due to the proton 
density itself, a tissue-specific property which quantifies the number of mobile 
hydrogen protons per unit volume. As the number of mobile hydrogens bound 
in water decreases slightly from pure water over fat to solids, a PD-weighted 
image emphasizes these variations. A long TR, sufficient for the magnetization 
to recover to equilibrium state, in combination with a short TE 
leads to a proton density weighted image. In summary, the following applies 
for sequences that consist of a simple excite-wait-measure-wait scheme: 
          | short TE        | long TE 
short TR  | $T_1$ weighting | - 
long TR   | PD weighting    | $T_2$ weighting 


$T_1$ weighting is achieved by a short TE to minimize $T_2$ effects and a short 
TR ($\approx T_1$) at which the longitudinal magnetization has not yet recovered. 

$T_2$ weighting is achieved by a long TR to reduce the $T_1$ impact and a long 
TE ($\approx T_2$) to allow the differences in $T_2$ decay to appear. 

PD weighting is achieved by a long TR such that the magnetization can reach 
equilibrium and by measuring immediately after the RF pulse (short TE). 


The missing combination is long TE and short TR, which would result in 
a contrast mixture of $T_1$ and $T_2$ with no clinical use and, additionally, a weak 
signal amplitude. 
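The effect of TR and TE on the contrast can be explored with a simplified signal model. The sketch below assumes the commonly used spin echo approximation $S = \mathrm{PD} \cdot (1 - e^{-\mathrm{TR}/T_1}) \cdot e^{-\mathrm{TE}/T_2}$, which is not derived in this text, and it assumes equal proton densities for both tissues. The relaxation times are those quoted in this chapter; the TR and TE settings are hypothetical example values.

```python
import math

# Relaxation times in ms as quoted in this chapter.
TISSUES = {
    "white matter": {"t1": 510.0, "t2": 67.0},
    "arterial blood": {"t1": 2500.0, "t2": 45.0},
}


def spin_echo_signal(tr, te, t1, t2, proton_density=1.0):
    """Simplified spin echo signal (assumed model, equal proton density by default)."""
    return proton_density * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)


def contrast(tr, te):
    """Absolute signal difference between the two example tissues."""
    white_matter = spin_echo_signal(tr, te, **TISSUES["white matter"])
    blood = spin_echo_signal(tr, te, **TISSUES["arterial blood"])
    return abs(white_matter - blood)


# Example settings in ms (TR, TE); the values are illustrative assumptions.
SETTINGS = {
    "T1 weighting (short TR, short TE)": (500.0, 10.0),
    "T2 weighting (long TR, long TE)": (12500.0, 60.0),
    "PD weighting (long TR, short TE)": (12500.0, 10.0),
}

for name, (tr, te) in SETTINGS.items():
    print(f"{name}: contrast = {contrast(tr, te):.3f}")
```

With equal proton densities, the PD-weighted setting yields the smallest difference between the two tissues, because both $T_1$ and $T_2$ effects are largely suppressed.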


6.2 Principles of Magnetic Resonance Imaging 


Having introduced the underlying physical phenomenon of MR, we will now 
look at the imaging component of MRI. So far, we cannot localize the source 
of an emitted radio frequency wave, but only measure the sum of the signals 
from all spatial locations affected by an excitation. 

Important components of an MRI system are the gradient coils, which 
allow us to impose a linear variation on the otherwise homogeneous magnetic 
field $B_0$. Three gradient coils oriented in three orthogonal directions, 
e.g., head-feet, left-right and anterior-posterior, enable such a variation in 
any spatial direction by a weighted combination of the three. 

Two concepts based on the gradient coil system will be explained to al- 
low the spatial localization of emitted RF waves: slice selection and spatial 
encoding. 


6.2.1 Slice Selection 


An intuitively understood concept is that of slice-selective excitation. If the 
gradient coils are used to induce a linear variation of the main magnetic field 
$B_0$ in head-feet direction, then the Larmor frequency $f_\ell$ of hydrogen nuclei, 
at which resonance occurs, will be spatially dependent on their offset $z$ along 
the direction: 

$$f_\ell(z) = \gamma \cdot (\|B_0\| + z) \tag{6.2}$$

Here, the offset $z$ is understood to be scaled by the gradient strength, so that 
it carries the unit of a magnetic field strength. Depending on the direction of 
the linear gradient, the Larmor frequency for $^1$H nuclei in the feet would be 
lower or higher than that of nuclei in the head. Now modifying the frequency of 
the excitation RF wave allows a slice-selective excitation: only nuclei whose 
Larmor frequency matches that of the wave will be excited. 

Figure 6.8: A linear variation of the magnetic field in head-feet direction 
causes the Larmor frequency $f_\ell(z)$ to be spatially dependent on the offset 
$z$ along the variation direction. An excitation pulse with frequencies in the 
range of $f_\ell(z_1)$ to $f_\ell(z_2)$ will cause resonance in nuclei that lie within the 
depicted slice of thickness $|z_1 - z_2|$. 

If the excitation contains only a single frequency, the corresponding excited 
slice will be infinitely thin and not enough nuclei will resonate to produce a 
measurable signal. By emitting a wave containing a range of frequencies, the 
thickness of a slice can be chosen to provide a good trade-off between spatial 
resolution and signal-to-noise ratio. A visualization of this concept is shown 
in Fig. 6.8. 
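The relation between slice position, slice thickness and the frequency content of the excitation pulse can be sketched numerically. The example below makes the gradient strength explicit as a parameter in T/m, which is an assumption added for illustration; the values of 10 mT/m and 5 mm as well as the function names are hypothetical.

```python
GAMMA_HYDROGEN = 42.576e6  # gyromagnetic ratio of hydrogen in Hz/T, as given in the text


def larmor_frequency(z, b0, gradient):
    """Larmor frequency in Hz at offset z (m) for main field b0 (T) and gradient (T/m)."""
    return GAMMA_HYDROGEN * (b0 + gradient * z)


def excitation_band(z_center, thickness, b0, gradient):
    """Lowest and highest RF frequency needed to excite the slice around z_center."""
    f1 = larmor_frequency(z_center - thickness / 2.0, b0, gradient)
    f2 = larmor_frequency(z_center + thickness / 2.0, b0, gradient)
    return min(f1, f2), max(f1, f2)


low, high = excitation_band(z_center=0.0, thickness=0.005, b0=1.5, gradient=0.010)
print(f"band: {low / 1e6:.6f} MHz to {high / 1e6:.6f} MHz")
print(f"bandwidth: {high - low:.1f} Hz")
```

Under these assumptions, a 5 mm slice corresponds to a bandwidth of roughly 2.1 kHz around 63.864 MHz; a thinner slice needs a narrower band, and shifting the center frequency moves the slice along z.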


6.2.2 Spatial Encoding 


Unfortunately, the slice selection method cannot be extended to encode spa- 
tial locations within a slice. Even with multiple gradient fields, we can only 
select a (possibly oblique) plane, not a single point in 3-D space. Instead, we 
make use of the phase information of spins in the transversal plane, i. e., the 
direction they are pointing. 
6.2.2.1 One-dimensional Example 


We will illustrate this with a 1-D example “image” and later extend the 
concept to multiple dimensions. Within our example slice, there are more 
hydrogen nuclei toward the boundaries and fewer in the middle, represented 
by the magnitude of the magnetization vectors within each voxel. 

[Illustration: magnetization vectors per voxel, with voxels arranged in left-right direction] 

Directly after excitation, all spins within the excited slice point in the 
same direction. The quantity measurable by our MRI system is the net 
magnetization, the sum of all spin magnetization vectors within the voxels of the 
excited slice: 

net magnetization = $\sum$ magnetizations within voxels 


In the absence of any relaxation, all spins would precess at the Larmor 
frequency implied by the magnetic field strength. Assuming a homogeneous 
magnetic field, the magnitude of the net magnetization vector would not 
change: 


By applying a linear gradient, the precession frequency of spins changes 
along the direction of the gradient, i.e., spins to the right rotate faster than 
those on the left. Spins are no longer in phase, and the phase shift between 
adjacent voxels is dependent on the strength of the applied field. This phase 
shift has an influence on the magnitude of the net magnetization vector and 
it may even become zero. A graph of the net magnetization magnitude for 
different gradient field strengths is shown below, with two example “images” 
showing different phase shifts. 
This may seem counterproductive at first sight, but it is the core principle 
of spatial encoding. If the hydrogen nuclei distribution of the measured tissue 
matches the “pattern” implied by the phase shift, there will be a measurable 
net magnetization. The better the match, the higher the magnitude of the 
net magnetization will be, as shown here: 
So by applying different gradients, i.e., creating different patterns in the 
phase orientation of spins, and measuring the net magnetization, i.e., the 
similarity to the applied pattern, we can get an intermediate representation 
of the underlying hydrogen density distribution. If done properly, the actual 
distribution within a slice can then be reconstructed from this intermediate 
representation. Note that this step happens after slice selection, i.e., the 
phase encoding is only applied to spins within the slice of interest. 
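The pattern matching idea can be tried out on a small synthetic example. The sketch below models each voxel magnetization as a complex number whose angle is the phase accumulated under the gradient. The example image, with more nuclei toward the boundaries and fewer in the middle, and the function name are hypothetical.

```python
import cmath
import math


def net_magnetization(image, phase_shift):
    """Sum of the per-voxel magnetization vectors for a phase shift per voxel (in rad)."""
    return sum(m * cmath.exp(1j * phase_shift * x) for x, m in enumerate(image))


N = 16
# Synthetic 1-D "image": high density at the boundaries, low density in the middle.
image = [1.0 + 0.8 * math.cos(2.0 * math.pi * x / N) for x in range(N)]

for cycles in range(4):
    k = 2.0 * math.pi * cycles / N  # phase pattern with this many cycles across the slice
    print(f"{cycles} cycle(s): |net magnetization| = {abs(net_magnetization(image, k)):.2f}")
```

Without a gradient, all vectors add up; with one cycle across the slice, the phase pattern matches the shape of the image and a clear signal remains; with two or three cycles, there is no match and the vectors cancel.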

The phase pattern can also be understood in terms of an intensity pattern, 
by mapping phase angles to gray values. For the two phase patterns shown 
above, the corresponding intensity patterns are shown here: 
The observations above can also be derived mathematically and demonstrate 
that we are actually measuring the Fourier transform of the signal (cf. 
Geek Box 6.1). 
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Geek Box 6.1: Relation to 1-D Fourier Transform 


The phase pattern can be described as a function which maps the 
offset in right-left direction x to a complex number of magnitude 1 
and an angle dependent on x and the phase shift k corresponding to 
the gradient field strength: 

    p_k(x) = e^{ikx} = \cos(kx) + i\sin(kx)    (6.3)


The match of the pattern p_k(x) to our image f(x) is performed by a 
pointwise multiplication and summation, i. e., a correlation. The result 
is the measured net magnetization m(k) dependent on the phase shift 
k: 

    m(k) = \int f(x) \, p_k(x) \, dx    (6.4)
         = \int f(x) \, e^{ikx} \, dx    (6.5)

We can now see that the image f(x) we want to compute is the 
Fourier transform of the measured net magnetization m(k), which is 
how it can be reconstructed. In other words, phase encoding performs 
a Fourier decomposition of our image. 
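
The relation in Geek Box 6.1 can be checked numerically. The following is only a minimal sketch, not part of the original derivation: the 1-D "image" f, the grid size n, and the choice to sample the phase shifts k on the discrete Fourier grid are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 1-D "image": hydrogen density along x (values are assumed).
n = 64
x = np.arange(n)
f = np.zeros(n)
f[20:28] = 1.0
f[40:44] = 0.5

# Assumption: the phase shifts k are sampled on the discrete Fourier grid.
k = 2.0 * np.pi * np.arange(n) / n

# Eq. (6.3): one phase pattern p_k(x) = exp(i k x) per row.
patterns = np.exp(1j * np.outer(k, x))

# Eqs. (6.4)/(6.5): pointwise multiplication and summation for every k.
m = patterns @ f

# Reconstruction: a discrete Fourier transform of m recovers f
# (the factor 1/n stems from the discretization).
f_rec = np.fft.fft(m).real / n
```

In this discrete setting, m equals n times NumPy's inverse FFT of f, which mirrors the statement that the image is the Fourier transform of the measured net magnetization.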


6.2.2.2 Generalization to Multiple Dimensions 


The concept of phase encoding effortlessly generalizes to more dimensions. 
For example, to perform spatial localization in two dimensions within an 
excited slice, e. g., left-right and anterior-posterior, two gradient fields in those 
directions are applied to create 2-D phase patterns: 

The concept can also be expanded to n-D and again comprises a Fourier 
transform (cf. Geek Box 6.2). 

Geek Box 6.2: Relation to n-D Fourier Transform 


In general, the phase shift pattern for an n-D image f(x), x ∈ R^n, 
dependent on the phase shift in each dimension combined into a vector 
k is 

    p_{\mathbf{k}}(\mathbf{x}) = e^{i \sum_{j=1}^{n} x_j k_j}
        = \cos\Big(\sum_{j=1}^{n} x_j k_j\Big) + i\sin\Big(\sum_{j=1}^{n} x_j k_j\Big)    (6.6)


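and the measured net magnetization m(k) is the n-dimensional inverse 
Fourier transform of f(x): 

    m(\mathbf{k}) = \int f(\mathbf{x}) \, p_{\mathbf{k}}(\mathbf{x}) \, d\mathbf{x}    (6.7)
                  = \int f(\mathbf{x}) \, e^{i \sum_{j=1}^{n} x_j k_j} \, d\mathbf{x}    (6.8)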


In the literature, a differentiation is often made, naming one of the 
considered dimensions the frequency encoding direction and the remaining n - 1 
dimensions the phase encoding directions. The process of spatial encoding is 
then explained as two separate steps, frequency encoding and phase encoding. 
However, the idea behind both is the same, namely the one outlined above, and 
the differences are only due to the technical procedure of reading out data 
with the scanner, which is omitted here. 


6.2.3 k-space 


In the magnetic resonance community, Fourier space is often referred to as 
k-space as a reference to the wavenumber k. The purpose of an MRI exami- 
nation is to fill the k-space with data so that an image can be reconstructed 
from it. Fig. 6.9 shows an example for a 2-D k-space with some associated 
phase patterns. 
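
To make the notion of "filling k-space" concrete, the sketch below computes every k-space sample of a small 2-D test image by correlating it with the corresponding 2-D phase pattern and then reconstructs the image with a discrete Fourier transform. The test image, the grid size, and the helper name `kspace_sample` are assumptions for illustration only; a real scanner does not acquire samples in this way.

```python
import numpy as np

n = 32
image = np.zeros((n, n))
image[8:24, 12:20] = 1.0                      # hypothetical object

y, x = np.indices((n, n))                     # pixel coordinates
k_values = 2.0 * np.pi * np.arange(n) / n     # assumed discrete k-space grid


def kspace_sample(image, kx, ky):
    """Correlate the image with the 2-D phase pattern exp(i (kx x + ky y))."""
    pattern = np.exp(1j * (kx * x + ky * y))
    return np.sum(image * pattern)


# Fill the k-space sample by sample (rows: ky, columns: kx).
kspace = np.array([[kspace_sample(image, kx, ky) for kx in k_values]
                   for ky in k_values])

# Reconstruct the image from the filled k-space.
reconstruction = np.fft.fft2(kspace).real / n**2
```

The sample at k = 0 is simply the sum of all image values, which is why the center of k-space carries the overall brightness of the image.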


6.2.4 Slice-selective vs. Volume-selective 3-D Imaging 


Having understood the concepts of slice selection and spatial encoding, two 
options present themselves for 3-D imaging: 

Figure 6.9: Example of a filled 2-D k-space, with grey values encoding 
similarity as dark = low and bright = high, with associated phase patterns 
for some k-space positions. 


Slice-selective: Use slice selection to successively acquire, reconstruct, and 
stack 2-D slices of the desired 3-D volume, using 2-D spatial encoding for 
localization within each slice. 

Volume-selective: Excite the entire volume without slice selection and use 
3-D spatial encoding for localization, followed by a 3-D reconstruction. 


Both approaches have advantages and disadvantages, and the choice of 
approach depends on the intended use of the acquired volume. 
Slice-selective acquisitions are commonly non-isotropic with high 
in-plane resolutions < 1 mm, but with a high slice thickness of several 
millimeters. Volume-selective acquisitions have an inherent signal-to-noise ratio 
benefit because more ¹H nuclei resonate. They typically feature isotropic 
resolution which lies in between the in-plane resolution and slice thickness of 
slice-selective acquisitions (for comparable acquisition durations). 


6.3 Pulse Sequences 


A pulse sequence describes the sequence of RF pulses which are applied in 
repetition in order to successively acquire the whole k-space of an object. This 
includes order and position of every sample as well as the information how the 
3-D gradient coils of the scanner hardware have to be adjusted accordingly. 
Sequences can look quite complex in a detailed view but they are usually 
determined by a small number of recurring building blocks such as (partial) 
flips of longitudinal magnetization via RF excitations, waiting periods and 
readout gradients. 

In the following, we focus on the description of the general concepts of two 
prominent pulse sequences. 


6.3.1 Spin Echo 


The spin echo (SE) pulse sequence is widely used as it allows us to regain the 
signal loss due to field imperfections. As we discussed earlier in Sec. 6.1.2, 
the decay of transverse magnetization after excitation is subject to dephasing 
due to random interactions and external field inhomogeneities. Obviously, 
we cannot influence the random interactions between nuclei, but we can alter 
the phase of nuclei using a 180° inversion pulse such that dephased nuclei 
can end up in phase again. For this to work, the field imperfections are 
assumed to be constant over time, which holds in practice for conventional 
MRI acquisitions. The principle of the spin echo sequence is shown in 
Fig. 6.10 and can be summarized as follows: 


1. A 90° RF pulse flips the magnetization into the transversal plane. 

2. Dephasing due to random nuclei interactions and field inhomogeneities 
sets in. Some nuclei will spin slightly faster and others slower due to local 
field variations. The phases of these nuclei will further diverge over time 
such that their magnetic moments will cancel each other, resulting in a 
decay of transversal magnetization. 

3. After a waiting period of TE/2, a 180° pulse inverts the magnetization by 
flipping the dephased vectors along the x (or y) axis. In consequence, the 
nuclei whose phase trailed behind are now ahead of the main magnetic 
moment and vice versa. 

4. After another waiting period of TE/2, all magnetic moments have 
refocused and are in phase again as the faster spins have caught up with the 
slower spinning nuclei by this time. Now, a large signal, the so-called spin 
echo, which is of negative sign but with the T2* effects removed, can be measured. 

5. Multiple echoes can be formed by repeating steps 2-4 as long as some 
signal is still available after T2 decay. Each signal has its own echo time 
TE1, TE2, ... after the 90° RF excitation. 
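
The refocusing described in steps 1-4 can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: each spin has a constant, randomly drawn off-resonance frequency (the spread of 0.5 rad/ms is an assumed value), random spin-spin interactions (T2 decay) are ignored, the 180° pulse is modeled as a mirroring of all phases, and only the magnitude of the net transverse magnetization is evaluated, so the sign of the echo is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spins = 2000
# Assumed static off-resonance frequencies in rad/ms (constant field imperfections).
delta_omega = rng.normal(0.0, 0.5, n_spins)
te = 20.0  # echo time in ms (assumed value)


def net_signal(t_ms):
    """Magnitude of the net transverse magnetization at time t after the 90° pulse."""
    if t_ms < te / 2:
        # Steps 1-2: free dephasing after excitation.
        phase = delta_omega * t_ms
    else:
        # Step 3: the 180° pulse at TE/2 mirrors the accumulated phase,
        # afterwards every spin keeps precessing at its own frequency.
        phase = -delta_omega * (te / 2) + delta_omega * (t_ms - te / 2)
    return np.abs(np.mean(np.exp(1j * phase)))


signal_start = net_signal(0.0)      # all spins in phase
signal_mid = net_signal(te / 2)     # strongly dephased
signal_echo = net_signal(te)        # step 4: spins are in phase again
```

Because the off-resonance frequencies are constant in this model, the phase accumulated in the first half is cancelled exactly in the second half, which is why the echo appears at t = TE.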


6.3.2 Gradient Echo 


Figure 6.10: A 90° pulse tips the magnetization into the xy plane (a). 
Some nuclei spin slightly faster (dashed) and others slower (dotted) than the 
resonance frequency due to local field variations (b). This process continues 
and leads to a reduction of measurable signal (green line on timeline) along 
B1 (c). A 180° pulse inverts the magnetization vectors at t = TE/2 and the 
spins start to rephase again (d). The total magnetization builds up as the 
magnetization vectors become in phase (e) and reaches its peak at t = TE 
(f). This yields the first echo of the original signal, indicated by the red dot 
in the timeline. 

The gradient echo (GRE) pulse sequence utilizes partial flips with angles 
below 90°, which allow for faster acquisitions compared to SE sequences. Note 
that the acquisition time for a single slice in a typical spin echo sequence is 
given by TR · N_y · N_ex, where N_y and N_ex are the number of phase encoding 
steps and excitations, respectively. As these parameters determine the 
resulting resolution and SNR, one wishes to reduce the scan time by selecting a 
TR as small as possible that still yields enough signal. However, with a 
typical 90° pulse the longitudinal magnetization M_z cannot recover sufficiently 
for very short TR. Thus, the trick is to use low flip angles which tip only 
part of the longitudinal magnetization into the transversal plane such that 
enough longitudinal magnetization is available for the next repetition after 
a short TR. The flip angle α of the RF pulse directly controls the resulting 
magnitude of the transversal magnetization ||M_xy|| = ||M_0|| sin α and the 
residual longitudinal magnetization ||M_z|| = ||M_0|| cos α, where M_0 is the 
initial magnetization. 
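
As a small worked example of the two relations in this paragraph, the following sketch evaluates the magnetization split for a given flip angle and the acquisition time TR · N_y · N_ex. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def magnetization_after_flip(m0, flip_angle_deg):
    """Return (transversal, longitudinal) magnetization right after the RF pulse."""
    alpha = math.radians(flip_angle_deg)
    return m0 * math.sin(alpha), m0 * math.cos(alpha)


def scan_time_s(tr_ms, n_phase_steps, n_excitations):
    """Acquisition time of a single slice, TR * N_y * N_ex, in seconds."""
    return tr_ms * n_phase_steps * n_excitations / 1000.0


# A 30° flip tips half of M_0 into the transversal plane and leaves
# about 87 % of it along the longitudinal axis for the next repetition.
m_xy, m_z = magnetization_after_flip(1.0, 30.0)

# Assumed example protocol: TR = 500 ms, 256 phase encoding steps, 1 excitation.
duration = scan_time_s(500.0, 256, 1)
```

With a 90° pulse the residual longitudinal magnetization is zero, so the full recovery has to come from T1 relaxation during TR, which is what makes very short TR impractical in that case.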

Another difference to the spin echo sequence is that GRE uses no 180° 
refocusing pulse, which makes it more susceptible to field inhomogeneities, 
leading to a T2* weighting instead of a T2 weighting. 

Also, as there is no 180° pulse, the echo in a GRE sequence is formed via 
a negative gradient in readout or frequency encoding direction which is an 
intended dephasing of the magnetization. The idea behind it is that since 
some time for preparations such as the spatial encoding is needed before the 
actual signal can be read out, we intentionally delay the time of the peak 
signal (echo) to a more convenient time. To this end, the dephasing gradient 
has the inverse sign of the readout gradient and is applied in advance for half 
of the time of the readout gradient. This ensures that the maximal signal 
can be obtained at the half of the readout period, since the positive readout 
gradient reverses the effects of dephasing and recalls the signal during the 
first half while it gradually dephases again during the second half. This is 
where the name gradient recalled echo stems from. 

Yet another characteristic of the GRE sequence is the formation of a steady 
state. In contrast to spin echo sequences, the GRE can have such a short TR 
that the signal decay due to T2* is incomplete and some transversal 
magnetization remains when the next RF pulse follows. In consequence, transversal 
magnetization accumulates over a few cycles, which is referred to as steady 
state. As the steady state may be unfavorable for some applications, the 
so-called spoiled GRE sequence tries to eliminate the residual transverse 
magnetization. Otherwise, its effects will manifest themselves in the image 
contrast. More on this is subject to further reading [3]. 


6.4 Advanced Topics 


Up to this point, the principles for morphological imaging have been intro- 
duced. We will now look at some advanced topics related to speeding up the 
acquisition process, suppressing signals from unwanted or enhancing signals 
from desired tissue classes as well as methods for functional imaging. 


6.4.1 Parallel Imaging 


Long acquisition times are a major drawback of MRI systems, with manifold 
negative consequences. Patients may experience discomfort, having to spend 
extended amounts of time in a narrow space. There is an impact on image 
quality, as patient motion during the acquisition is inevitable. And from a 
financial standpoint, the number of MRI examinations per unit time is much 
lower than for other modalities. 

Figure 6.11: Parallel imaging uses local receiver coils placed around the 
patient which capture the signal from multiple positions, allowing for a re- 
duction of acquisition time. The coils are embedded in the table as well as in 
the flexible coil on top of the patient’s chest, in this case an 18-element body 
coil where individual coil elements are arranged in 6 columns and 3 rows. 
Three individual coil elements are exemplarily indicated by the red circles. 
Image courtesy of Siemens Healthineers AG. 


Techniques for shortening the acquisition time are, thus, of vital impor- 
tance and an active field of research. Parallel imaging is an established tech- 
nique which allows a reduction of the amount of data needed to be acquired 
in order to reconstruct an image. The name stems from the use of multiple 
local receiver coils which are placed around the patient and acquire resonance 
signals in parallel. Modern MRI systems have coil elements which are em- 
bedded in the table as well as flexible coil elements which can be placed on 
the patient (see Fig. 6.11). 

This enables many possibilities for undersampling in k-space to reduce 
the acquisition time, one of which we will present here. Suppose we regularly 
undersample the k-space by acquiring only every n-th line. The value n is 
referred to as the undersampling factor, indicating the reduction in acquisi- 
tion time. A standard reconstruction by performing a Fourier transform is 
no longer possible in this case because it will introduce aliasing artifacts in 
the reconstructed image (see Fig. 6.12). 
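
The aliasing caused by regular undersampling can be reproduced in a few lines. The sketch below is an illustration under assumptions, not the acquisition procedure of a scanner: the test image is hypothetical, NumPy's forward FFT stands in for the measurement, and the missing lines are simply set to zero before a standard Fourier reconstruction (undersampling factor n = 2).

```python
import numpy as np

size = 64
image = np.zeros((size, size))
image[10:26, 20:44] = 1.0            # hypothetical object in the upper half

# Fully sampled k-space (assumption: FFT as a stand-in for the measurement).
kspace = np.fft.fft2(image)

# Regular undersampling: keep only every second line, zero the others.
undersampled = kspace.copy()
undersampled[1::2, :] = 0.0

# Standard Fourier reconstruction of the undersampled data.
aliased = np.fft.ifft2(undersampled).real

# The result is the original image superimposed with a copy shifted by
# half the field of view, each at half intensity.
expected = 0.5 * (image + np.roll(image, size // 2, axis=0))
```

Each pixel of the aliased image is therefore a sum of two pixels of the unaliased image that lie half a field of view apart, which is the starting point for the coil-based reconstruction described next.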

A reconstruction technique called sensitivity encoding (SENSE) is able to 
reconstruct an image without aliasing artifacts from such undersampled data 
by employing the information collected from the multiple receiver coils. It 
exploits the fact that each local receiver coil “sees” a slightly different image, 
namely one in which parts closer to the coil are better represented than those 
farther away. A so-called coil sensitivity map s_γ(r_p) describes how well a coil 
γ sees a pixel at position r_p. 

Figure 6.12: A visualization of a fully sampled k-space (a) with its 
corresponding reconstruction (b) as well as a regularly undersampled k-space 
with every second line missing (c) with a reconstruction showing aliasing 
artifacts (d). 

In case of regular undersampling, each pixel in the aliased image (see 
Fig. 6.12(d)) can be described as a weighted sum of several pixels in the 
unaliased image (see Fig. 6.12(b)). In the given example, 2 pixels in the una- 
liased image contribute to a pixel in the aliased image. Different local receiver 
coils “see” different aliased images, where the weights for the weighted sum 
of pixels are described by the coil sensitivity map of the respective coil. Geek 
Box 6.3 illustrates this for an idealized 2-coil setup whereas Geek Box 6.4 
discusses how to calibrate the sensitivity maps online. 




Geek Box 6.3: Coil Sensitivity Maps 


The parallel measurement can be described by a linear system of 
equations: 

    \mathbf{a} = \mathbf{S} \mathbf{v},    (6.9)

where v is a vector of length m of unaliased pixel values contributing 
to an aliased pixel, a is a vector of length c of the aliased pixel value 
as seen by the c different coils and S is a c × m matrix containing the 
coil sensitivities as 

    S_{\gamma,p} = s_\gamma(r_p),    (6.10)

where γ is the coil index and r_p are the pixel positions of the pixels 
contained in v. The reconstruction then consists of solving Eq. (6.9) 
for all pixels in the unaliased image. 

This process can be visualized looking at idealized coil sensitivity maps 
for a 2-coil setup where one coil is more sensitive in the upper part of 
the image and the other in the lower part of the image: 


[Figure: sensitivity maps of Coil 1 and Coil 2, each shown next to the 
aliased image seen by the respective coil.] 


Here, brighter values represent higher weights. The aliased images 
seen by these coils differ, one has a better representation of the top 
part of the head and one a better representation of the bottom part, 
see the red highlights. 
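
Solving Eq. (6.9) pixel by pixel can be sketched as follows for an idealized 2-coil setup with undersampling factor 2. Everything specific here is an assumption for illustration: the test image, the linear sensitivity profiles, the noise-free data, and the simplification that the aliased image is formed directly in image space by folding the lower half onto the upper half.

```python
import numpy as np

n = 64
half = n // 2

# Hypothetical object and idealized, smooth coil sensitivities (assumed).
image = np.zeros((n, n))
image[8:56, 16:48] = 1.0
image[20:30, 24:40] = 2.0
rows = np.linspace(0.0, 1.0, n)[:, None] * np.ones((1, n))
sens = np.stack([1.0 - 0.8 * rows,     # coil 1: more sensitive at the top
                 0.2 + 0.8 * rows])    # coil 2: more sensitive at the bottom

# Undersampling factor 2: pixels at rows y and y + n/2 fold onto each other.
weighted = sens * image                                   # shape (coils, n, n)
aliased = weighted[:, :half, :] + weighted[:, half:, :]   # shape (coils, n/2, n)

# Unfold: solve a = S v for every aliased pixel (c = 2 coils, m = 2 pixels).
recon = np.zeros_like(image)
for y in range(half):
    for x in range(n):
        S = np.array([[sens[0, y, x], sens[0, y + half, x]],
                      [sens[1, y, x], sens[1, y + half, x]]])
        a = aliased[:, y, x]
        v, *_ = np.linalg.lstsq(S, a, rcond=None)
        recon[y, x], recon[y + half, x] = v
```

A least-squares solver is used instead of a direct inverse so that the same code also covers the case of more coils than folded pixels (c > m). The unfolding only works where the coil sensitivities of the folded pixels differ enough to keep S well conditioned.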

Geek Box 6.4: Coil Sensitivity Map Calibration 


A remaining problem is how to compute the coil sensitivity maps. An 
a-priori calibration is infeasible as they are dependent on the imaging 
volume. Therefore, we describe an auto-calibration approach that can be 
used to determine coil sensitivity maps during the scan. 

In addition to the undersampled acquisition as shown in Fig. 6.12(c), 
a small, fully sampled region around the center of k-space is measured. 
A direct Fourier transform reconstruction of this region for each coil 
leads to low-resolution versions of the volume as seen by the respective 
coil, for our two-coil example: 


[Figure: acquisition mask, low-resolution image of Coil 1, and low-resolution image of Coil 2.] 


An approximation of the coil sensitivity maps is obtained by dividing 
these images by a sum-of-squares combination of all coil images: 

    s_\gamma(r_p) \approx \frac{i_\gamma(r_p)}{\sqrt{\sum_{\gamma'=1}^{C} i_{\gamma'}(r_p)^2}},    (6.11)

where i_γ(r_p) is the image of coil γ and C is their total number: 

[Figure: sum-of-squares image of Coil 1 and 2, sensitivity map s_1(r_p), and sensitivity map s_2(r_p).] 


The resulting coil sensitivity maps show artifacts in the regions with 
little to no signal as compared to the idealized maps. Reconstructions 
using these coil sensitivity maps will display noise amplification in air 
regions due to these artifacts. More advanced methods exist to deal 
with these issues, but they are beyond the scope of this text. 
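
The auto-calibration of Geek Box 6.4 can be sketched as below. This is a simplified illustration, not the procedure of any particular scanner: the function name `estimate_sensitivity_maps`, the size of the calibration region, the test data, and the small constant `eps` that avoids division by zero in signal-free regions are all assumptions, and the magnitude of the (complex) low-resolution images is used in the sum-of-squares combination.

```python
import numpy as np


def estimate_sensitivity_maps(coil_kspace, n_calib=16, eps=1e-8):
    """Estimate coil sensitivity maps from the fully sampled k-space center.

    coil_kspace: array of shape (coils, ny, nx) with the k-space center
    in the middle of the array.
    """
    _, ny, nx = coil_kspace.shape
    cy, cx, h = ny // 2, nx // 2, n_calib // 2
    mask = np.zeros((ny, nx))
    mask[cy - h:cy + h, cx - h:cx + h] = 1.0        # small central region

    # Low-resolution image of every coil via direct Fourier reconstruction.
    low_res = np.fft.ifft2(np.fft.ifftshift(coil_kspace * mask, axes=(-2, -1)))

    # Sum-of-squares combination of all coil images, cf. Eq. (6.11).
    sos = np.sqrt(np.sum(np.abs(low_res) ** 2, axis=0))
    return low_res / (sos + eps)


# Hypothetical two-coil test data (assumed image and sensitivities).
n = 64
image = np.zeros((n, n))
image[12:52, 16:48] = 1.0
rows = np.linspace(0.0, 1.0, n)[:, None] * np.ones((1, n))
true_sens = np.stack([1.0 - 0.8 * rows, 0.2 + 0.8 * rows])
coil_kspace = np.fft.fftshift(np.fft.fft2(true_sens * image), axes=(-2, -1))

maps = estimate_sensitivity_maps(coil_kspace)
```

By construction the squared magnitudes of the estimated maps sum to approximately one wherever there is signal, so the maps describe relative rather than absolute sensitivities.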

Figure 6.13: A T2 preparation sequence can be used to increase the contrast 
between myocardium and arterial blood due to their large difference in T2 
constants (35 ms vs. 250 ms). Figure recreated from [6]. 


6.4.2 Spectrally Selective Excitation 


For bright-blood coronary imaging, it is often desirable to increase contrast 
between the myocardium (the heart muscle) and the arterial blood within the 
heart chambers. This allows a better delineation of the heart walls and thus 
increases the diagnostic value of images. A method called T2 preparation 
exploits the fact that arterial blood has a much larger T2 constant than 
myocardial tissue and venous blood. This means that the decay of transversal 
magnetization M_xy is slower for arterial blood, see Fig. 6.13. 

Using a sequence of pulses, we can reduce the magnetization of the 
myocardium and venous blood while keeping the magnetization of arterial blood 
virtually untouched. Fig. 6.14 illustrates the process. In the initial state, spins 
of all tissue types are aligned with the main magnetic field B0. A 90° pulse 
pushes the spins into the transversal plane, where they precess. At this point, 
T1, T2 and T2* relaxation (see Sec. 6.1.2) start to affect the spins: 


T1 relaxation: We will ignore T1 relaxation here because it happens on a 
much larger timescale than the duration of the T2 preparation sequence. 

T2* relaxation: Due to magnetic field inhomogeneities, spins would rapidly 
dephase and lose their transversal magnetization, known as T2* relaxation. 
To counteract this effect, we apply a series of 180° pulses to refocus the 
spins. 

T2 relaxation: This is the effect we actually want to happen. While the 
spins remain in the transversal plane, myocardium and venous blood spins 
lose their transversal magnetization much faster than arterial blood. 


After a set amount of time, a final -90° pulse realigns spins with the main 
magnetic field. At this point, arterial blood spins have a higher magnetization 
and subsequent imaging sequences (e.g., GRE or spin echo sequences) will 
show an increased contrast to myocardial tissue. 
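
The effect of the preparation can be estimated with the usual exponential model of T2 decay, exp(-t / T2). The sketch below uses the T2 constants quoted in the caption of Fig. 6.13 (35 ms for myocardium, 250 ms for arterial blood); the preparation duration of 50 ms is an assumed value for illustration, and T1 as well as T2* effects are ignored, as in the text.

```python
import math


def remaining_transverse_fraction(duration_ms, t2_ms):
    """Fraction of transversal magnetization left after pure T2 decay."""
    return math.exp(-duration_ms / t2_ms)


prep_duration_ms = 50.0   # assumed duration of the T2 preparation

myocardium = remaining_transverse_fraction(prep_duration_ms, 35.0)
arterial_blood = remaining_transverse_fraction(prep_duration_ms, 250.0)

# Ratio of the magnetization available to the subsequent imaging sequence.
contrast_ratio = arterial_blood / myocardium
```

For these values, roughly a quarter of the myocardial magnetization but more than four fifths of the arterial blood magnetization survive the preparation, which is the source of the increased contrast.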

Figure 6.14: In the initial state, the spins for myocardial tissue, venous 
and arterial blood are all aligned along the main magnetic field. A 90° pulse 
pushes them into the xy plane, where they precess (a). 180° refocusing pulses 
are used to counteract T2* relaxation, so only T2 relaxation affects the spins 
while they are in the xy plane (b), (c). Due to the different T2 constants, 
the transversal magnetization decays faster for myocardial and venous blood 
spins. After a set amount of time, spins are realigned with the main 
magnetic field by a -90° pulse (d). An imaging sequence following this 
preparation sequence will now display increased contrast between arterial blood and 
myocardial tissue. Figure recreated from [6]. 


6.4.3 Non-contrast Angiography 


Up to this point, we have quietly assumed that all measured spins remain 
stationary for the duration of the imaging. This assumption is, of course, 
invalid if we image the human body. For many applications, we have to adapt 
the acquisition protocol to minimize artifacts due to motion. For example, we 
may ask the person being scanned to hold their breath to minimize respiratory 
motion artifacts. Imaging sequences with a short TR can be used to “freeze” 
cardiac motion. But in some cases, we can actually use non-stationary spins 
to our advantage. 

Angiography, the imaging of blood vessels, is commonly performed by 
administering a contrast agent which increases the contrast of blood to 
surrounding tissues. In MRI, we can use the fact that blood spins move 
continuously during the image acquisition to perform a non-contrast angiography. 
In the magnetic resonance context, the time-of-flight (TOF) effect refers to 
the short amount of time that flowing blood spins remain within an imaging slice. 

[Figure 6.15 panels: (a) Directly after saturation; (b) Unsaturated spins enter 
the imaging plane due to blood flow. Labels in both panels: saturated imaging 
plane, blood vessel, flow direction.] 

Figure 6.15: For non-contrast TOF angiography, all spins in the desired 
imaging plane are saturated, i.e., put in a state such that they cannot be 
excited by an RF pulse, indicated by the gray shading (a). Due to blood 
flow, unsaturated spins enter the imaging plane for blood vessels that are 
not entirely parallel to the imaging plane (b). Subsequent imaging sequences 
will show a high contrast between those vessels and surrounding stationary 
tissue. 

For TOF angiography, we first saturate all spins within our imaging plane, 
i.e., put them in a state where they cannot be excited by an RF pulse. For the 
duration of their T1 relaxation, stationary spins within the imaging plane will 
thus show little to no signal if we perform an imaging sequence. But due to 
the TOF effect, unsaturated blood spins continuously flow into the imaging 
plane and will appear bright in contrast to the surrounding stationary tissue 
if imaged. 

This imaging technique works best for blood vessels perpendicular to the 
imaging plane. For vessels that lie within the imaging plane, the contrast 
becomes weaker with increasing distance to the point where unsaturated spins 
enter the imaging plane. Fig. 6.15 illustrates the concept of TOF angiography. 


6.4.4 The BOLD Effect 


One of the applications of functional magnetic resonance imaging (fMRI) is 
the visualization of neuronal activity in the brain. Increased neuronal activity 
leads to local oxygen depletion in the active regions of the brain. This lack 
of oxygen is subsequently overcompensated, leading to a higher concentra- 
tion of oxygenated blood in the active regions. Thus, an increased oxygen 
concentration can be seen as an indication of neuronal activity. 

Our aim is to measure this increased concentration using the blood- 
oxygenation-level dependent (BOLD) effect, which describes the different 
magnetic properties of oxygenated and deoxygenated hemoglobin. Blood 
containing higher concentrations of oxygenated hemoglobin has a higher T2* 
constant, i.e., less dephasing due to local magnetic field inhomogeneity (see 
Sec. 6.1.2). Thus, to measure the neuronal activity due to an external stim- 
ulus, we can compare images acquired in a resting state to images acquired 
during the application of the stimulus to see which brain regions experience 
a change in local oxygen concentration. 

A GRE sequence (see Sec. 6.3) can be used to obtain T2*-weighted images. 
However, as the T2* differences are very slight, the usual approach to gain 
robust results is to acquire multiple resting state images and multiple images 
while the stimulus is applied in an alternating fashion, followed by a statistical 
test to determine if the intensity of a given pixel significantly differs in both 
sets of images. 
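
The pixel-wise comparison of resting-state and stimulus images can be sketched with a two-sample t-statistic. The text does not specify which statistical test is used, so Welch's t-statistic is an assumption here, as are the simulated data, the size of the signal change, and the detection threshold.

```python
import numpy as np


def welch_t_map(rest, stim):
    """Pixel-wise Welch t-statistic for image stacks of shape (n_images, ny, nx)."""
    mean_diff = stim.mean(axis=0) - rest.mean(axis=0)
    var_rest = rest.var(axis=0, ddof=1) / rest.shape[0]
    var_stim = stim.var(axis=0, ddof=1) / stim.shape[0]
    return mean_diff / np.sqrt(var_rest + var_stim)


# Simulated T2*-weighted images (assumed baseline 100, noise std 1).
rng = np.random.default_rng(42)
n_images, ny, nx = 40, 16, 16
rest = 100.0 + rng.normal(0.0, 1.0, (n_images, ny, nx))
stim = 100.0 + rng.normal(0.0, 1.0, (n_images, ny, nx))
stim[:, 4:8, 4:8] += 3.0          # hypothetical activated region

t_map = welch_t_map(rest, stim)
active = np.abs(t_map) > 6.0      # assumed, deliberately strict threshold
```

Averaging over many images is what makes the small intensity change detectable: the standard error of the mean difference shrinks with the square root of the number of images per condition.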


7.1 Introduction 


In this chapter, the physical principles of X-rays are introduced. We start 
with a general definition of X-rays compared to other well-known types of 
radiation, e. g., visible light. In Sec. 7.2, we will learn how X-rays can be generated and 
how they can be characterized with respect to their energy. The most relevant 
concept to understand how X-ray imaging works is the behavior of X-rays 
when they interact with matter. This is outlined in detail in Sec. 7.3. In 
Sec. 7.4, conventional X-ray imaging is described with a focus on detector 
types and sources of noise. Finally, we finish this chapter with an overview 
of well known application areas for X-ray imaging in Sec. 7.5. 


7.1.1 Definition of X-rays 


X-rays belong to the group of electromagnetic rays, hence, they follow the 
rules of electromagnetic radiation. Electromagnetic radiation transports 
energy, also called radiant energy, through space by waves and photons, just as 
radio waves, visible light or microwaves. It can either be represented by 
photons or by a wave model. We will use both representations in the course 
of this chapter. Radiation can be classified by its wavelength λ, which is the 
length of one period of the wave. The wavelength can also be represented by 
the frequency f_p and the wave's propagation speed, i.e., the speed of light c_0: 

    \lambda = \frac{c_0}{f_p}    (7.1)

In Eq. (7.2) the energy of photons is given, where h denotes Planck's constant 
(≈ 6.626 069 × 10^-34 J s) and c_0 is the speed of light (≈ 2.997 92 × 10^8 m s^-1). 
The energy is directly related to the photon's wavelength λ, or its frequency 
f_p, and is commonly given in the unit electron volt [eV]. We can easily see that the 
photon energy is proportional to its frequency and inversely proportional to its 
wavelength, that is, the higher its frequency, the higher its energy. 

    E_p = h f_p = \frac{h c_0}{\lambda}    (7.2)

Figure 7.1: Wavelengths and frequencies of the different groups of 
electromagnetic radiation. X-rays lie in the range of 0.01 nm up to 10 nm. 


The energy is also used to classify electromagnetic radiation into 
different groups, i.e., radio waves, microwaves, infrared (IR), visible light, 
ultraviolet (UV) light, X-rays and γ-rays. Fig. 7.1 shows these groups with 
respect to their characteristic ranges of frequency and wavelength. Note that 
the wavelength of most X-rays lies in the range of 0.01 nm up to 10 nm. This 
corresponds to an energy range of roughly 100 keV down to 100 eV. 
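
The quoted energy range can be verified directly from Eqs. (7.1) and (7.2). The sketch below uses the constants given in the text; the conversion factor from joule to electron volt (the elementary charge, about 1.602 177 × 10^-19 C) is not stated in the text and is added here as a known physical constant. The function names are hypothetical.

```python
PLANCK_J_S = 6.626069e-34          # Planck's constant h, as given in the text
SPEED_OF_LIGHT_M_S = 2.99792e8     # speed of light c_0, as given in the text
JOULE_PER_EV = 1.602177e-19        # elementary charge (not from the text)


def frequency_hz(wavelength_m):
    """Eq. (7.1) solved for the frequency: f_p = c_0 / lambda."""
    return SPEED_OF_LIGHT_M_S / wavelength_m


def photon_energy_ev(wavelength_m):
    """Eq. (7.2): E_p = h * f_p = h * c_0 / lambda, converted to eV."""
    return PLANCK_J_S * frequency_hz(wavelength_m) / JOULE_PER_EV


energy_short_kev = photon_energy_ev(0.01e-9) / 1e3   # wavelength 0.01 nm
energy_long_ev = photon_energy_ev(10e-9)             # wavelength 10 nm
```

The two limits evaluate to about 124 keV and 124 eV, so the range of 100 keV down to 100 eV stated above is a rounded order-of-magnitude figure.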

Like visible light, X-rays lose a certain amount of energy when they pass 
through different materials. The energy loss depends on the absorption 
behavior of the material. For example, if X-rays pass through 10 cm of water, 
they lose less energy than if they passed through 10 cm of bone. The 
reduction of energy is caused by absorption, which is the main principle of 
traditional X-ray imaging. Generally speaking, X-ray radiography measures 
the amount of energy loss. Because this energy loss differs for the different 
materials, we can see a certain contrast in the image. For example, an X-ray 
image shows high intensities for soft tissue and lower intensities where the 
X-rays passed through bones. Note that the absorbed energy is directly related 
to the dose that is delivered to the patient during an acquisition. 


7.1.2 History and Present 


Discovery of X-rays 


X-rays were discovered by Wilhelm Conrad Röntgen in Würzburg, 
Germany. On November 8, 1895, he conducted experiments including Crookes 
tubes, which are typically used to visualize streams of electrons. He further 
used a fluorescent screen and covered the actual tube with black cardboard. 
When moving the fluorescent screen away from the tube's opening, he realized 
that there was still a glimmer visible on the fluorescent screen, which had to 
be the result of radiation that passes through the black cardboard. 
Additional experiments in which he replaced the cardboard with denser materials, 
e. g., books, led to the same observation. After that, he began a systematic 
study of the new radiation, which he then named “X”-rays. One of the first 
acquired X-ray images is shown in Fig. 7.3. It depicts the hand of Röntgen's 
wife, where we can clearly see the ring she was wearing on her ring finger. 
It is not to be confused with a similar image depicted in Fig. 7.4 which was 
taken later in January of 1896. 

Already on December 28, 1895, about seven weeks after the first discovery, 
Röntgen submitted the first known article on X-rays entitled “Über eine 
neue Art von Strahlen” (On a new type of rays), which shows first reports 
on the absorption properties of different materials, e. g., paper, wood and 
also metal. Already in January 1896, Röntgen demonstrated his discovery 
to the German medical-physical society. Creating an X-ray of the hand of 
Albert von Kölliker (cf. Fig. 7.4), a well-known anatomist at that time, in front 
of the audience immediately convinced Röntgen's colleagues of the utility of 
his invention. For his groundbreaking discovery, Röntgen received the first 
Nobel Prize in Physics in 1901. In Fig. 7.2, we can see an image of 
Wilhelm Conrad Röntgen, taken for the Nobel Prize committee. The actual 
commercial implementation was performed by others (cf. Geek Box 7.1). 


X-rays Today 


Today, X-rays are routinely used in diagnostic but also in interventional 
ical imaging around the globe. Also in industry, X-rays are often the method 
of choice, for example to test for very small cracks in metal parts in the field 
of non-destructive testing. In medical imaging, a variety of applications have 
been developed that go far beyond simple radiographic imaging. For example 

Geek Box 7.1: Commercial Success of X-rays 


Röntgen donated his discovery to humanity and never made any 
commercial profit. He also never filed a patent for his invention. The actual 
commercial roll-out of the technology was performed by several small 
companies: 

• In 1877, Erwin Moritz Reiniger, a former employee of 
  Friedrich-Alexander-University Erlangen-Nuremberg, Germany, founded a 
  small workshop right next to the university with the aim of producing 
  batteries and physical measurement devices. In 1886, Max Gebbert 
  and Karl Friedrich Schall joined Reiniger's workshop, founding 
  the Vereinigte Physikalisch-Mechanische Werkstätten Reiniger, 
  Gebbert & Schall, Erlangen, New York, Stuttgart. In 1896, they 
  switched the focus of production to X-ray tubes. Over the years this 
  small company grew and is today known under the name Siemens 
  Healthineers AG. 

• Victor Electric Company was founded in 1893 by C. F. Samms and 
  J. B. Wantz in a basement in Chicago, United States of America, 
  with the aim of producing physical measurement gear. In 1896, 
  they also began with the production of X-ray tubes. This small 
  company also turned out to be very commercially successful and is today 
  known as General Electric Healthcare. 

• In 1896, C. H. F. Müller developed the first commercial X-ray tube 
  in Hamburg, Germany, in cooperation with the University Clinic 
  Hamburg-Eppendorf. In 1927, the company was bought and is today 
  an integral part of Philips Medizin Systeme GmbH. 


fluoroscopy allows for real-time X-ray sequences, which are often indispensable 
in minimally invasive interventions. Further, digital subtraction angiography 
(DSA) provides an effective tool to visualize even small vessel structures. In 
the 1970s, the step to CT was taken, which now allows us to visualize the 
complete human body in three dimensions. Another point of rapid development is 
the awareness that X-rays can also be harmful. High energies emitted to the 
body during an X-ray acquisition can lead to ionization. That means the 
radiation changes the atomic structure of the tissue, which can potentially lead 
to an increased risk for the development of cancer. Here, deoxyribonucleic 
acid (DNA) becomes damaged by the radiation. In most cases, the DNA will 
be repaired by the cell itself. Yet, the repair process sometimes fails, which 
in some cases leads to an unregulated division of cells that might result in 
cancer. X-ray-based foot scanners were still in use to measure foot sizes in 
shoe stores until the 1970s. Nowadays, the majority of people are aware of the 
risk posed by X-rays, and the transmitted patient dose during X-ray scans 
has been significantly reduced in the past decades¹. In Fig. 7.5, we can see 
an X-ray taken in mammography, which typically uses low-energy X-rays to 
increase soft-tissue contrast. Another example is shown in Fig. 7.6, where we 
can see an X-ray image taken from the thorax, i.e., the chest, of a patient. 

Figure 7.2: Wilhelm Conrad Röntgen. 

Figure 7.3: One of the first X-rays, taken by Wilhelm Röntgen of his wife's hand. 

Figure 7.4: Image taken in 1896 showing Albert von Kölliker's hand. 


7.2 X-ray Generation 


A classical X-ray tube is depicted in Fig. 7.7. An X-ray tube is basically an
evacuated glass tube containing a cathode and a solid metal anode. Thermionic
emission occurs at the heated filament of the cathode: electrons e⁻ are
released because the thermal energy supplied to the filament material exceeds
their binding energy. The electrons are then accelerated by the tube’s
acceleration voltage between the negative cathode and the positive anode.
When those fast electrons hit the anode, they are decelerated and deflected
by the electric field of the atoms of the anode material. Any acceleration
of charged particles results in electromagnetic waves, and so does the
slowing down, i.e., the negative acceleration, of the electrons in the metal
anode. This deceleration generates X-rays.

¹ Although full of mistakes (as claimed by its authors), a good overview of
radiation doses can be found at https://xkcd.com/radiation/.

Figure 7.5: Example of a breast X-ray. Note the clearly visible structures
in the soft tissue.

Figure 7.6: Example of an X-ray taken from the chest of a patient.

Figure 7.7: Vacuum X-ray tube: The image on the left shows a schematic of
how electrons are accelerated from the cathode to the anode by the
acceleration voltage to generate X-ray photons. The image on the right shows
a historic vacuum X-ray tube. Image provided by Science Museum, London,
Wellcome Images under Creative Commons by-nc-nd 2.0 UK.

The anode is tilted by a certain angle to direct the emerging X-rays in the
desired direction. Typically, each electron is slowed down or deflected
several times, so it creates several photons. However, it can also happen
that an electron loses all its velocity and thus its energy in one step. In
this case, only one photon containing the complete energy of the electron is
created.
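
The limiting case in which a single photon carries the complete energy of the electron fixes the upper end of the emitted spectrum. The following is a minimal sketch, assuming the standard relations E_max = e·U and E = h·c/λ, which are not stated explicitly in the text; the function names and the example tube voltages are illustrative only.

```python
# Physical constants (exact SI values)
PLANCK_H = 6.62607015e-34            # J s
LIGHT_C = 299_792_458.0              # m / s
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def max_photon_energy_kev(tube_voltage_kv: float) -> float:
    """An electron accelerated by U kilovolts gains U keV of kinetic energy."""
    return tube_voltage_kv


def min_wavelength_nm(tube_voltage_kv: float) -> float:
    """Shortest emitted wavelength in nm, from E = h * c / lambda."""
    energy_joule = tube_voltage_kv * 1e3 * ELEMENTARY_CHARGE
    return PLANCK_H * LIGHT_C / energy_joule * 1e9


# Example tube voltages (assumed values, chosen for illustration)
for kv in (30.0, 80.0, 120.0):
    print(f"{kv:5.0f} kV -> E_max = {max_photon_energy_kev(kv):5.0f} keV, "
          f"lambda_min = {min_wavelength_nm(kv):.4f} nm")
```

Under these assumptions, a higher tube voltage shifts the upper limit of the spectrum to higher photon energies, i.e., to shorter wavelengths.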

The production of X-rays is caused by two different processes as shown
in Fig. 7.8. First, if the electron interacts with an inner-shell electron of
the target, characteristic X-radiation can be produced. This kind of X-rays
results from a sufficiently strong interaction that ionizes the target atom
by a total removal of the inner-shell electron. The resulting “hole” in the
inner shell is filled with an outer-shell electron. The transition of an
orbital electron from an outer shell to an inner shell is accompanied by the
emission of an X-ray photon with an energy equal to the difference in the
binding energies of the orbital electrons involved. Therefore, the
characteristic radiation produces a line spectrum, or discrete spectrum.
Obviously, this kind of radiation is material-dependent. Both the production
of characteristic X-rays and the production of thermal energy involve
interactions between the accelerated electrons and the electrons of the
target material.

The second process of X-ray production arises from another type of
interaction in which the electron can lose its kinetic energy: the
interaction of the electron with the nucleus of a target atom. As the
incident electron passes by the nucleus of an anode atom, it is slowed down
and deflected, leaving with reduced kinetic energy in a different
direction. This loss in kinetic energy reappears as an X-ray photon. This
type of X-rays is called Bremsstrahlung, where “bremsen” is the German verb
for slowing down. The amount of kinetic energy that is lost in this way can
vary from zero to the total incident energy. While the characteristic
radiation results in a discrete X-ray spectrum of characteristic peaks, the
Bremsstrahlung provides a continuous spectrum. The number of X-rays emitted
decreases rapidly at very low photon energies. The spectrum of a tungsten
source is given in Fig. 7.8. In medical imaging, the very low energies of an
X-ray spectrum are typically removed before the beam reaches the patient by
a thin metal plate placed between the X-ray source and the patient. The
reason for this is that almost all of the low-energy photons would be
absorbed by the patient, thus leading to an increased patient dose without
a substantial improvement of image quality. The metal plate is also called
an X-ray filter, which is not to be confused with the mathematical filters
used for image processing.


7.3 X-ray Matter Interaction 


X-rays have the ability to penetrate matter, yet the number of penetrating
X-ray photons is material-dependent. Their ability to penetrate human tissue
is in fact the reason why they can be used to obtain information on internal
organs. Different tube voltages between the cathode and the anode produce
X-ray spectra of higher or lower energy. In the energy range that is used for
medical imaging, there are three kinds of relevant interactions that can
occur when X-rays pass through matter:


• interaction with atomic electrons,

• interaction with nucleons,

• interaction with electric fields associated with atomic electrons and
  atomic nuclei.

Figure 7.8: X-ray spectrum of a tungsten tube. The peaks correspond to
the characteristic radiation; the continuous part of the spectrum represents
the Bremsstrahlung.


Consequently, the X-ray photons either experience a complete absorption, 
elastic scattering or inelastic scattering. 

The interaction that is used for medical imaging consists of a reduction
of radiation intensity, which is nothing else than a reduction of the number
of photons that arrive at the detector. This process is usually referred to
as attenuation. Several different physical effects contribute to
attenuation, including a change of the photon count, photon direction, or
photon energy. All of these effects have in common that they are based on
an interaction between single photons and the material they are passing
through, and that the attenuation induced by each of them is highly
energy-dependent. Fig. 7.9 shows an overview of the different relevant
effects. Note that pair production is not relevant for typical diagnostic
X-ray energies: to produce a positron and an electron, the photon’s energy
must be at least 2 × 511 keV = 1022 keV.


7.3.1 Absorption 


When a monochromatic X-ray beam traverses a homogeneous object with
absorption coefficient $\mu$, according to Lambert-Beer’s law, the observed
intensity $I$ is related to the intersection length $x$ of the ray and the
object:

$$ I = I_0 \, e^{-\mu x}. \qquad (7.3) $$

Here, $I_0$ is the X-ray intensity at the X-ray source. A derivation is found
in Geek Box 7.2.

Figure 7.9: Principles of photon-matter interactions.

In X-ray CT, the fractional transmitted intensity $I/I_0$ is measured for
a large number of ray paths through the object. The logarithm of this ratio
is used to obtain a set of line integrals as an input to reconstruction
algorithms.

Geek Box 7.2: Lambert-Beer’s Law


The radiation intensity decreases along the ray, which leads to an ordinary,
linear and homogeneous first-order differential equation with constant
coefficient

$$ \frac{\mathrm{d}I}{\mathrm{d}x} = -\mu I. $$

$I$ is the intensity of the incident radiation, $\mathrm{d}x$ is the
thickness of the material, and $\mu$ is the material attenuation
coefficient. $\mu$ mainly consists of contributions from the photoelectric
absorption effect and the Compton scattering. If we divide by $I$ and take
the integral on both sides, we obtain

$$ \int_0^x \frac{1}{I(x')} \frac{\mathrm{d}I}{\mathrm{d}x'} \, \mathrm{d}x' = -\int_0^x \mu \, \mathrm{d}x' $$

$$ \log I(x) - \log I(0) = -\mu x. $$

To resolve the logarithms in the equation, we take the exponential of both
sides. Thus, the equation can be rewritten as

$$ I(x) = I(0) \, e^{-\mu x}. $$

In general, we can define $I_0$ as the energy of the incident beam and $I$
as the energy after the beam has traversed material of thickness $x$:

$$ I = I_0 \, e^{-\mu x}. $$
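
As a minimal numerical sketch of the relation derived above, the following snippet evaluates Lambert-Beer’s law for a homogeneous object and recovers the line integral μx from the logarithm of the intensity ratio. The attenuation coefficient, the source intensity, and the thicknesses are assumed, illustrative values, and the function names are not taken from the text.

```python
import math


def transmitted_intensity(i0: float, mu_per_cm: float, length_cm: float) -> float:
    """Lambert-Beer's law for a homogeneous object: I = I0 * exp(-mu * x)."""
    return i0 * math.exp(-mu_per_cm * length_cm)


def line_integral(i0: float, i: float) -> float:
    """Recover mu * x from the measured intensities: -log(I / I0)."""
    return -math.log(i / i0)


# Assumed values: mu = 0.2 / cm is on the order of soft tissue at
# diagnostic energies; it is not a number quoted in the text.
i0 = 1.0e5
mu = 0.2
for x in (1.0, 5.0, 10.0, 20.0):
    i = transmitted_intensity(i0, mu, x)
    print(f"x = {x:5.1f} cm  I/I0 = {i / i0:.4f}  "
          f"-log(I/I0) = {line_integral(i0, i):.3f}")
```

The negative logarithm grows linearly with the thickness, which is why it, and not the raw intensity, serves as the input to reconstruction algorithms.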


When the ray passes through inhomogeneous objects, the factor $\mu(x)$ is the
linear attenuation coefficient at each point on the ray path:

$$ I = I_0 \exp\left( -\int \mu(x) \, \mathrm{d}x \right). \qquad (7.4) $$

However, in practical setups, the emitted X-ray photons have various
energies, resulting in polyenergetic spectra as shown in Fig. 7.8. The
measured intensity $I$ of a polychromatic beam on the detector can be written
as the sum of monochromatic contributions for each energy $E$ in the X-ray
spectrum ($E \in [0, E_{\max}]$). The attenuation coefficient $\mu$ is also
energy-dependent. When polychromatic X-rays are taken into account, we get

$$ I = \int_0^{E_{\max}} I_0(E) \exp\left( -\int \mu(x, E) \, \mathrm{d}x \right) \mathrm{d}E, \qquad (7.5) $$

where $I_0(E)$ is the normalized energy spectrum, i.e.,
$\int I_0(E) \, \mathrm{d}E = 1$.
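
The polychromatic model of Eq. (7.5) can be illustrated with a discretized spectrum. The sketch below replaces the integral over energy by a sum over three energy bins and assumes a homogeneous object. The spectrum weights and attenuation coefficients are hypothetical numbers chosen for illustration, not measured data.

```python
import math

# Hypothetical three-bin spectrum:
# (energy in keV, normalized weight I0(E), mu(E) in 1/cm).
# All numbers are illustrative assumptions.
SPECTRUM = [
    (40.0, 0.3, 0.27),
    (60.0, 0.5, 0.21),
    (80.0, 0.2, 0.18),
]


def polychromatic_intensity(length_cm: float) -> float:
    """Discrete version of Eq. (7.5) for a homogeneous object."""
    return sum(w * math.exp(-mu * length_cm) for _, w, mu in SPECTRUM)


def effective_mu(length_cm: float) -> float:
    """Coefficient a monochromatic model would infer: -log(I) / x."""
    return -math.log(polychromatic_intensity(length_cm)) / length_cm


# The spectrum has to be normalized, i.e., the weights sum to one.
assert abs(sum(w for _, w, _ in SPECTRUM) - 1.0) < 1e-12

for x in (1.0, 10.0, 30.0):
    print(f"x = {x:4.1f} cm  I = {polychromatic_intensity(x):.5f}  "
          f"effective mu = {effective_mu(x):.4f} 1/cm")
```

In this toy example, the inferred coefficient decreases with increasing thickness because the low-energy bins are attenuated more strongly, so a single monochromatic coefficient no longer describes the measurement.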
7.3.2 Photoelectric Effect


The photoelectric effect was originally explained by Einstein, which helped
to establish the quantized nature of light. It describes a situation in which
the incident X-ray photon energy is larger than the binding energy of an
electron in the target material atom. The incident X-ray photon gives up its
entire energy to liberate an electron from an inner shell. The ejected
electron is called a photoelectron. The incident photon then ceases to exist.
The photoelectric effect often leaves a vacancy in the inner shell of the
atom, where the ejected electron was previously located. As a result, the
“hole” created in the inner shell is filled by an outer-shell electron. Since
the outer-shell electron is at a higher energy state, characteristic
radiation is emitted. Therefore, the photoelectric effect produces a positive
ion, a photoelectron, and a photon of characteristic radiation. For
tissue-like materials, the binding energy of the K-shell electrons is very
small. Thus, the photoelectron acquires essentially the entire energy of the
X-ray photon.


7.3.3 Compton Scattering 


The second type of matter interaction is Compton scattering, which is
named after Arthur Holly Compton, who received the Nobel Prize in 1927
for his discovery. For high X-ray energies, Compton scattering is the
dominant interaction mechanism in tissue-like materials. The energy of the
incident X-ray photon is considerably higher than the binding energy of the
electron. As a result, the incident X-ray photon strikes an electron and
ejects it from the atom. In Compton scattering, the incoming photon is
deflected or scattered through an angle θ with partial loss of its initial
energy. The incident photon transfers a portion of its energy to the
electron, which is the so-called “recoil electron”, or Compton electron.
Therefore, the interaction produces a positive ion, a “recoil electron”, and
a scattered photon. The scattered photon may be deflected at any angle from
0° to 180°. After a Compton interaction, most of the energy is retained by
the scattered photon, corresponding to a small deflection angle.
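
How much energy the scattered photon retains can be made concrete with the standard Compton formula, E' = E / (1 + (E / 511 keV)(1 − cos θ)). The formula is not given in the text and is used here as an assumption of this sketch; the photon energy of 60 keV is an illustrative choice.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0  # rest energy of the electron


def scattered_photon_energy_kev(energy_kev: float, angle_deg: float) -> float:
    """Compton formula: E' = E / (1 + (E / 511 keV) * (1 - cos(theta)))."""
    theta = math.radians(angle_deg)
    ratio = energy_kev / ELECTRON_REST_ENERGY_KEV
    return energy_kev / (1.0 + ratio * (1.0 - math.cos(theta)))


energy = 60.0  # keV, an assumed diagnostic photon energy
for angle in (0, 30, 90, 180):
    scattered = scattered_photon_energy_kev(energy, angle)
    print(f"theta = {angle:3d} deg  E' = {scattered:5.2f} keV  "
          f"recoil electron = {energy - scattered:5.2f} keV")
```

For this assumed energy, the scattered photon keeps the larger share of the energy at every angle, and the share is largest for small deflection angles.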


7.3.4 Rayleigh Scattering


Rayleigh scattering is a coherent process and is the predominant kind of
scattering at low X-ray energies. It is caused by an interaction of the
incident wave with several, usually outer-shell, electrons. A very low-energy
photon interacts with bound orbital electrons of the atom. No ejection
occurs, but the electrons, and thus the whole atom, are set into vibration
with respect to the incident photon’s wavelength. The excess energy of the
vibrating electrons is transferred to an electromagnetic photon which has the
same wavelength but possibly a different direction than the incident photon.
In this interaction, electrons of the material’s atoms are not ejected and no
energy is converted into kinetic energy; thus, no ionization occurs.
Rayleigh-scattered photons are mostly emitted in a forward direction with
respect to the incident photon’s direction. For X-rays used for imaging, the
contribution of Rayleigh scattering to total attenuation is usually small
compared to other contributions.

Figure 7.10: Schematic principle of an image intensifier detector. The X-rays
are first converted to light, which is converted to electrons. An electron
optic accelerates the electrons towards a fluorescent screen, which converts
the electrons to light and eventually results in an image.


7.4 X-ray Imaging 


In the previous sections, the concepts of X-ray generation and the
interaction of X-rays with matter have been outlined. In this section, we
now focus on different detection methods used to convert the X-rays that
have passed through the patient to an actual image. Unlike the old X-ray
films, which use X-rays directly to change the chemical properties of the
film material, modern detection systems first convert the X-rays to light
and eventually to electrons.


7.4.1 Image Intensifiers 


X-ray image intensifiers are vacuum tubes that are used to convert X-rays
into visible light, i.e., an image. The schematic principle of this process
is shown in Fig. 7.10. First, the incoming X-ray photons are converted to
light photons using a phosphor material called the input phosphor. The
produced light is further converted to electrons by exploiting the
photoelectric effect inside a photocathode. These electrons are then
accelerated and focused towards the output phosphor using an electron optic
system. In the output phosphor, the electrons are converted back to visible
light, which can then be captured by film material or television camera
tubes.

Figure 7.11: Detailed principle of an image intensifier detector. The X-rays
are first converted to light, which is converted to electrons. An optic
focuses the electron beam onto a fluorescent screen or film material, which
converts the electrons to light, i. e., the image.


Before the introduction of image intensifiers in the late 1940s, fluoroscopic
detection systems consisted of only one phosphor material, in which X-rays
were directly converted to light. However, the mismatch between the large
number of X-ray quanta needed and the small number of emerging visible light
quanta led to very dark images and high radiation exposure. Thus,
radiologists had to view the images in dark surroundings and only after a
certain time of dark adaptation of their eyes. The biggest advantage of image
intensifier systems is that the brightness of the output image became
adjustable via the amount of acceleration supplied by the electron optics.
Modern X-ray image intensifiers have an input field diameter of about 15 to
57 cm. They are characterized by conversion factors that indicate how
efficiently X-rays are transformed to visible light.


7.4.1.1 Function 


A more detailed overview of the individual parts of an image intensifier is
given in Fig. 7.11. First, the incoming X-rays pass through the input
window, which typically consists of a convex aluminum plate with a thickness
of approximately 1 mm. The convex shape enhances mechanical stability and
also reduces the distance to the patient, which effectively increases the
useful entrance field size.

After passing through the input window, the X-rays hit the input phosphor,
which converts X-ray photons to light photons. The generated light photons
trigger a photoelectric effect in the photocathode, which then emits
(photo-)electrons. The input phosphor and the photocathode are typically
layered into one piece: the input phosphor, which consists of another
aluminum plate coated with the phosphor layer, is followed by an
intermediate layer and the photocathode layer.

Let us focus on the input phosphor layer in more detail. One important
property that influences the efficiency of the input phosphor layer is its
thickness. The thicker the phosphor layer, the higher its absorption; thus,
more X-ray photons are absorbed and converted to light. Hence, fewer X-ray
photons are required, which reduces radiation exposure to the patient.
However, with increasing thickness, more light photons are also scattered
within the phosphor layer, which effectively reduces the spatial resolution.

Another property that is used to increase conversion factors is the chemical
composition of the input phosphor material and its resulting mass
attenuation coefficient. Ideally, the input phosphor’s attenuation
coefficient is matched to the residual incoming X-ray spectrum. Initially,
zinc-cadmium sulfide (ZnCdS) was used as phosphor material, which has been
replaced by cesium iodide (CsI) in modern detector systems. The advantages
of CsI over ZnCdS are twofold. In Fig. 7.12, we illustrate the mass
attenuation coefficient of CsI (dashed, dark blue line) and ZnCdS (dotted,
light blue line) w.r.t. the photon energy. Additionally, the estimated
spectral distribution of a typical X-ray spectrum after transmission through
the patient is depicted as a solid, orange line. The larger the overlap
between the attenuation characteristics and the residual X-ray spectrum, the
better the conversion efficiency. We can clearly see that the mass
attenuation coefficient of CsI matches the expected residual X-ray spectrum
better and is thus favorable.

Additionally, the manufacturing process of CsI allows the phosphor layer to
be built as a collection of small, local cylindrical structures as indicated
in Fig. 7.14. The cylindrical structures act as optical fibers that guide the
emitted light to the photocathode with high spatial accuracy. Thus,
scattering of the light photons within the phosphor material can be
drastically reduced. In modern detectors, the input phosphor is about 300 µm
to 500 µm thick and can absorb up to 70% of the incoming X-ray photons. A
single 60 keV X-ray photon can create up to 2600 light photons, of which
approximately 62% reach the photocathode.
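
The figures quoted above can be chained into a rough upper-bound estimate of the number of light photons that arrive at the photocathode. The sketch below only multiplies the numbers from the text; treating them as independent factors is a simplifying assumption, and the function name is illustrative.

```python
# Figures quoted in the text for a modern CsI input phosphor.
ABSORBED_FRACTION = 0.70          # up to 70% of incoming X-ray photons
LIGHT_PHOTONS_PER_XRAY = 2600     # upper bound for one 60 keV X-ray photon
FRACTION_REACHING_CATHODE = 0.62  # share of light photons reaching the cathode


def light_photons_at_photocathode(incoming_xray_photons: float) -> float:
    """Upper-bound estimate of light photons arriving at the photocathode."""
    absorbed = incoming_xray_photons * ABSORBED_FRACTION
    return absorbed * LIGHT_PHOTONS_PER_XRAY * FRACTION_REACHING_CATHODE


per_absorbed = LIGHT_PHOTONS_PER_XRAY * FRACTION_REACHING_CATHODE
print(f"per absorbed X-ray photon: {per_absorbed:.0f} light photons")
print(f"per 1000 incoming X-ray photons: "
      f"{light_photons_at_photocathode(1000):.0f} light photons")
```

With these numbers, a single absorbed 60 keV photon delivers at most about 1600 light photons to the photocathode.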

The photocathode layer typically consists of antimony-cesium (SbCs3).
Similar to the input phosphor layer, the incoming light should match the
sensitivity spectrum of the photocathode. Fig. 7.13 shows the sensitivity
spectrum of an SbCs3 photocathode, together with the characteristic light
spectra emitted from a CsI as well as a ZnCdS phosphor layer. We can see
that here, too, CsI produces a light spectrum that matches the photocathode
better, hence leading to a higher conversion efficiency from light photons
to electrons.

Figure 7.14: The cesium iodide layer has a cylindrical structure and acts as
optical fibers. Thus, the scattering of the light photons is reduced
significantly.


After the electrons leave the photocathode, they are accelerated by the
anode as shown in Fig. 7.11. Moreover, the accelerated electrons are focused
onto the output phosphor using electrostatic fields produced by the electron
optics. No additional electrons are introduced into the system by this
process; the existing electrons are merely accelerated and deflected. The
increase in kinetic energy that results from the acceleration process leads
to a higher number of light photons being emitted when the electrons hit the
output phosphor. Hence, the intensity or brightness of the output phosphor
can be altered by regulating the acceleration voltage. The output phosphor
typically consists of silver-activated zinc-cadmium sulfide (ZnCdS:Ag) and
is very thin (4 µm to 8 µm). About 2000 light photons are generated for a
single 25 keV electron. Since each electron is released by one light photon
in the photocathode, this also represents an increase in brightness by a
factor of 2000.

7.4.1.2 Known Problems


Besides common limitations that all imaging systems share, e. g., spatial
resolution and contrast ratio, image intensifier systems are best known for
vignetting and distortion artifacts. Vignetting, as illustrated in
Fig. 7.15, describes a drop in brightness that occurs at the outer parts of
the screen. It is caused by light scattering that deflects light photons in
the output phosphor from the outer part of the phosphor to the inside.
However, no light is scattered from outside the material into the outer
regions of the phosphor, yielding an increased brightness in the central
regions. Another common artifact is image distortion as indicated in
Fig. 7.16. The electron optics of image intensifiers are susceptible to
external magnetic or electric fields. Even the earth’s magnetic field causes
considerable distortions in the output image. To correct for distortion
artifacts, regular calibration is needed, in which the distortion field is
estimated by measuring predefined calibration objects. The distortion can be
corrected either by adjusting the electron optics accordingly or by
subsequent image processing if the images have been digitized.


7.4.2 Flat Panel Detectors 


In recent years, flat panel detectors (FPDs) have become the state of the
art in X-ray detector technology for radiography, angiography, and C-arm CT
applications. They were first introduced in the mid-1990s, and their main
advantages are a direct digital readout of the X-ray image and an increased
spatial resolution. Flat panel detectors can be categorized into direct and
indirect conversion FPDs.

Indirect Conversion FPDs


Similar to the image intensifier system discussed in the previous section,
the indirect conversion FPD still converts X-rays to light photons by using
a layer of cesium iodide (CsI). The tubular structure of the CsI is also
identical to the input layer of an image intensifier system as shown in
Fig. 7.14. The major difference lies in the subsequent detection steps.
Image intensifiers make use of a further conversion of light photons to
electrons, which are then accelerated to increase and control illumination.
This additional conversion step is not necessary for flat panel detectors.
Instead, a matrix of photodiodes is directly attached to the CsI layer and
converts the emitted light photons to an electric charge, which is then
stored in capacitors for each pixel. Each pixel also contains a thin-film
transistor (TFT), which acts as a small “switch” used for the readout of the
stored charges.


Direct Conversion FPDs 


Instead of an explicit conversion to light photons, direct conversion FPDs
have a homogeneous layer of X-ray-sensitive photoconductors on the TFT
matrix. The top layer is a high-voltage bias electrode that builds up an
electric field across the photoconductor. If X-rays are absorbed by the
photoconductor, so-called charge carriers, i. e., electron-hole pairs, are
released. These pairs are then separated into negative and positive charges
and transported to the pixels’ electrodes by the global electric field.
Positive charges travel to the bottom of the individual pixel electrodes,
where they are stored in capacitors.


Data Readout and Properties 


For both indirect and direct conversion FPDs, the readout of the pixels is
done row-wise at a certain readout frequency. A row is selected by
“switching on” the TFTs of this row’s pixels, i. e., by applying a voltage
to the gates of the TFTs. The stored charges of each pixel are directed to a
charge-integrating amplifier and subsequently converted to a digital
representation. These digital pixel values are serialized and transferred
over a bus system to the imaging computer. Common FPDs for medical imaging
can have a side length of up to 40 cm and a pixel size of about 100 µm to
150 µm. They are available in square as well as in wide formats. The
analog-to-digital conversion uses a quantization of 12 to 16 bit. To
increase the signal-to-noise ratio, multiple pixels are often combined into
a bigger pixel during the readout process, which is also known as binning.
Typical binning modes are 2×2 or 4×4 binning, reducing the image size by a
factor of 2 or 4 in each dimension, respectively. Because binning does not
require any additional time, the frame rate increases by the binning factor.
Frame rates typically vary between 7.5 and 30 frames per second, depending
on the medical application, dose requirements, and binning factors.
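
The effect of binning on the image size can be sketched in a few lines. The example below sums non-overlapping blocks of digital pixel values; whether a real detector combines the charges before or after digitization is not specified in the text, so summing digital values is an assumption of this sketch, as is the function name.

```python
def bin_image(image: list[list[float]], factor: int) -> list[list[float]]:
    """Sum non-overlapping factor x factor blocks of a 2D image (list of rows)."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by the binning factor")
    return [
        [
            sum(image[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# A tiny 4 x 4 example "detector image" with arbitrary pixel values.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
binned = bin_image(image, 2)
print(binned)  # [[14, 22], [46, 54]]
```

With 2×2 binning, the 4×4 image shrinks to 2×2 pixels, and each output pixel collects the signal of four detector pixels.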

A major advantage of flat panel detectors is the significant reduction of
space and weight needed for the detection unit. This may sound trivial, but
the benefit becomes clearer when we consider that space is typically
limited, especially in an interventional environment, and that increased
weight directly affects the rotation speeds of CT or C-arm CT devices.
Another advantage is the robustness against (moderate) electric and magnetic
fields, which posed a huge problem for image intensifiers. Moreover, the
images are directly available in digital form, which makes patient handling
and data storage more efficient.


7.4.3 Sources of Noise 


There are two types of undesirable effects in medical imaging systems:
probabilistic noise and artifacts. Similar to noise, artifacts are image
degradations that also originate from physical effects during the scan.
However, the difference to noise is that when a scan is repeated using the
exact same object and scan parameters, artifacts are reproduced exactly,
whereas noise will change according to a probabilistic scheme. Some
artifacts, for example, distortion and vignetting, have already been shown
in Sec. 7.4.1 on image intensifier detectors. In the following, we focus on
the sources and propagation of noise in X-ray imaging.

As illustrated in Fig. 7.17, an X-ray photon passes through different stages
on its way from the source to the detector. Each step in this chain follows
either a Poisson distribution (cf. Geek Box 7.3) or a binomial distribution
(cf. Geek Box 7.4). In Fig. 7.18, we show both distributions in comparison.
The X-ray photon generation process (cf. Geek Box 7.5) follows a Poisson
distribution. The matter interaction and the detection step (cf. Geek
Box 7.6) follow a binomial distribution. Both processes interact along the
path of the X-ray (cf. Geek Box 7.7), resulting in yet another Poisson
distribution. As such, Lambert-Beer’s law also has a probabilistic
interpretation (cf. Geek Box 7.8), and every observation on the detector is
Poisson distributed in the monochromatic case.
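
The statement that a Poisson-distributed photon count followed by a binomial selection step is again Poisson distributed can be checked numerically. The sketch below computes the distribution of the chained process by summing over all possible numbers of emitted photons and compares it with a Poisson distribution whose expectation value is the product of the emitted mean and the survival probability. The numerical values and function names are assumptions chosen for illustration.

```python
import math


def poisson_pmf(n: int, mean: float) -> float:
    """P(N = n) for a Poisson distribution with the given expectation value."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def binomial_pmf(k: int, trials: int, p: float) -> float:
    """P(K = k) for `trials` independent Bernoulli trials with probability p."""
    return math.comb(trials, k) * p**k * (1.0 - p) ** (trials - k)


def detected_pmf(k: int, mean_emitted: float, p_survive: float,
                 n_max: int = 200) -> float:
    """Poisson emission followed by binomial thinning, summed over n emitted."""
    return sum(
        poisson_pmf(n, mean_emitted) * binomial_pmf(k, n, p_survive)
        for n in range(k, n_max + 1)
    )


mean_emitted = 20.0  # assumed expected number of emitted photons
p_survive = 0.3      # assumed probability of passing the object unaffected
for k in (0, 3, 6, 10):
    chained = detected_pmf(k, mean_emitted, p_survive)
    direct = poisson_pmf(k, mean_emitted * p_survive)
    print(f"k = {k:2d}  chained = {chained:.6f}  Poisson(N0 * p) = {direct:.6f}")
```

The sum is truncated at `n_max` emitted photons, which is far in the tail of the assumed emission distribution, so the two columns agree to within numerical precision.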

A common quality measure for imaging is the signal-to-noise ratio (SNR).
It is not uniquely defined across different fields of application. In X-ray
imaging, it makes sense to use the definition based on statistics, i. e.,

$$ \mathrm{SNR}(N) = \frac{\bar{n}}{\sigma} = \frac{E(N)}{\sqrt{E((N - \bar{n})^2)}}. \qquad (7.6) $$

For random variables $N$ that follow a normal distribution, $\bar{n}$ is the
mean value and $\sigma$ represents the standard deviation. More generally
speaking, the two variables are given by the first moment ($\bar{n}$) and
the square root of the second central moment ($\sigma$) of the underlying
distribution. The first moment of the Poisson distribution is given by its
expectation value $\bar{n} = E(N)$, whereas $\sigma$ is the square root of
the expectation value of the squared difference between the random variable
and its expectation value, $\sigma = \sqrt{E((N - \bar{n})^2)}$. Hence, no
matter what distribution, $\sigma$ provides a measure of variation, i. e., a
measure of noise. As a result, the SNR gives a measure for the signal
quality by dividing the expectation value by $\sigma$. If the measured data
did not contain any noise, $\sigma$ would be zero and the SNR would approach
infinity. If the noise level increases, $\sigma$ also increases; thus, the
SNR decreases. The expectation value $\bar{n}$ in the numerator makes the
SNR stable to scaling, which means that if we measure very high values at
the detector, a small amount of noise is less critical than if we measure
small values that contain the same amount of noise. For X-rays, we can
demonstrate that $\mathrm{SNR}(N) \propto \sqrt{N_0}$ (cf. Geek Box 7.9). As
a consequence, the SNR only doubles if we use four times as many photons for
$N_0$. Note that this estimation is simplified and neglects some effects
such as detector read-out noise.

Figure 7.17: Overview of noise related processes in X-ray imaging.

Figure 7.18: Mass distribution functions of Poisson and binomial
distributions.
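
The square-root behavior of the SNR can be illustrated with a short calculation. The sketch below assumes an ideal, monochromatic measurement in which the detected count is Poisson distributed, so that its variance equals its mean; the attenuation coefficient, path length, and photon numbers are illustrative assumptions.

```python
import math


def expected_counts(n0: float, mu_per_cm: float, length_cm: float) -> float:
    """Expected number of detected photons according to Lambert-Beer's law."""
    return n0 * math.exp(-mu_per_cm * length_cm)


def snr_poisson(mean_counts: float) -> float:
    """SNR of a Poisson variable: mean / standard deviation = N / sqrt(N)."""
    return mean_counts / math.sqrt(mean_counts)


mu, length = 0.2, 10.0  # assumed attenuation (1/cm) and path length (cm)
for n0 in (1e4, 4e4, 16e4):
    n = expected_counts(n0, mu, length)
    print(f"N0 = {n0:8.0f}  detected = {n:8.1f}  SNR = {snr_poisson(n):6.2f}")
```

Each row uses four times as many emitted photons as the previous one, and the SNR doubles from row to row.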
Geek Box 7.3: Poisson Distribution


The Poisson distribution is a discrete probability distribution and its
mass distribution function is defined by

$$ \mathrm{Poisson}(N_0) = p(N = n) = \frac{(N_0)^n}{n!} \, e^{-N_0}, \qquad (7.7) $$

where $N_0$ is the expectation value of the observed event $E(N)$. We
now show a simple example for the usage of the Poisson distribution.
Assume a local shop records its daily number of customers for
a year, which results in an average of $N_0 = 15$ customers per day.
The Poisson distribution can now be used to calculate the probability
that on a new day there will be $n = 20$ customers in the shop, i.e.,
$p(N = 20) = \frac{(N_0)^n}{n!} e^{-N_0} = \frac{15^{20}}{20!} e^{-15} \approx 0.0418$.
In Fig. 7.18(a), the mass distribution function as defined in Eq. (7.7) is
shown for three different expectation values $N_0$. If $N_0$ becomes large,
the Poisson distribution approaches a normal distribution with mean
$\bar{n} = N_0$ and standard deviation $\sigma = \sqrt{N_0}$. This is based
on the so-called “central limit theorem”. In Fig. 7.18(a), we have also
added the corresponding normal distributions for each Poisson distribution.
You can clearly see that the higher $N_0$, the closer the discrete
Poisson distribution gets to a normal distribution.
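
The shop example can be reproduced directly from Eq. (7.7). The following short sketch evaluates the mass distribution function for an expectation value of 15 and n = 20; the function name is illustrative.

```python
import math


def poisson_pmf(n: int, expectation: float) -> float:
    """Mass distribution function of Eq. (7.7): N0**n / n! * exp(-N0)."""
    return expectation**n / math.factorial(n) * math.exp(-expectation)


p_20 = poisson_pmf(20, 15.0)
print(f"p(N = 20) = {p_20:.4f}")  # 0.0418
```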


7.5 X-ray Applications 


7.5.1 Radiography 


Radiography describes the process of creating two dimensional projection 
images by exposing an anatomy of interest to X-rays and measuring the at- 
tenuation they undergo when passing through the object. It is a very common 
form of X-ray imaging and is used in clinics around the globe. 

The main application area is the examination of fractures and changes of 
the skeletal system. Here, the high attenuation coefficient of bones compared 
to the surrounding tissue delivers a good contrast and allows for distinct 
detection and classification of fractures. Moreover, radiography can be used to 
detect changes of a bone’s consistency or density, e. g., in case of osteoporosis 
or bone cancer. In Fig. 7.19, two X-ray images of an arm with fractures of 
Ulna and Radius bones are shown on the left. Furthermore, the figure shows a 
color image taken from the arm after intervention and also two further X-ray 
images of the treated arm where the bones have been internally fixated using 
metal plates. 
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Geek Box 7.4: Binomial Distribution 


The binomial distribution is also a discrete distribution and can be 
used to model a series of Bernoulli trials, i.e., a series of random 
experiments with binary outcome. The mass distribution function is 
given by 


pwr =n) = (T) any" - php" ss) 


It describes the probability that exactly n positive outcomes occur 
in a series of N independent trials, where p denotes the probability 
for an individual trial having a positive outcome. The most intuitive 
example for a binomial distribution is coin tossing. From coin tossing 
we know that the probability of getting heads or tails in a single toss
is “fifty-fifty”, i.e., $p = 0.5$. If we now want to know the probability of
getting exactly $n = 20$ heads when tossing the coin $N = 30$ times, we obtain
$P(N = 20) = \binom{30}{20}\, p^{20} (1 - p)^{30-20} = \binom{30}{20}\, (0.5)^{20} (1 - 0.5)^{30-20} \approx 0.028$.
In Fig. 7.18(b), three cases of a binomial mass distribution
function are shown for a varying number of trials $N$. The probability
of a single trial being true was fixed with $p = 0.5$.
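
The coin-tossing example can be checked numerically. The following is a minimal sketch; the function name `binomial_pmf` is our own choice for illustration and does not appear in the text.

```python
from math import comb


def binomial_pmf(n: int, N: int, p: float) -> float:
    """Probability of exactly n positive outcomes in N independent trials."""
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)


# Coin-toss example from the text: 20 heads in 30 tosses of a fair coin.
p_20_heads = binomial_pmf(20, 30, 0.5)
print(round(p_20_heads, 3))  # 0.028
```

Summing `binomial_pmf(n, 30, 0.5)` over all n from 0 to 30 gives 1, as expected for a probability mass function.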


Geek Box 7.5: Statistics of the X-ray Generation Process 


How is the Poisson distribution related to X-ray imaging? It can be 
shown that the number of generated X-ray photons at the anode is 
Poisson distributed. As input parameters we have the number of accelerated
fast electrons $N_e$ and the probability $p_{ex}$ for one fast electron
being converted to an X-ray photon. With $N_0 = N_e\, p_{ex}$, the
distribution is then given by


$$P(N = n) = \frac{(N_e\, p_{ex})^{n}}{n!}\, e^{-N_e p_{ex}} = \frac{(N_0)^{n}}{n!}\, e^{-N_0}, \qquad (7.9)$$


where $N_0$ denotes the expected value for the number of electrons that
trigger an X-ray photon, which is also known as a measure for the
radiation intensity. We use $P(N_x)$ as the probability that an X-ray
source produces exactly $N_x$ X-ray photons. $P(N_x)$ is then given by
the above equation, where $n$ has been replaced by $N_x$.



Geek Box 7.6: Statistics of the X-ray Matter Interaction 


The generated X-ray photons are now traveling through space towards 
the detector and interact with the matter they pass through according 
to Beer’s law as introduced in Eq. (7.3). Whether the photons interact 
with the matter or pass through it unaffected can be interpreted as a
Bernoulli trial, i.e., an experiment with random binary outcome. Whether
a single photon undergoes an interaction depends on the material
properties along the photon's path, i.e., the attenuation $\mu(x)$. The
probability $p_a$ for the photon passing unaffected is again given by
Beer's law:

$$p_a = e^{-\int \mu(x)\,\mathrm{d}x}. \qquad (7.10)$$


As the individual X-ray photons are independent from each other in 
terms of interacting or not, the process can be described as a binomial 
distribution. Furthermore, it can be shown that when we have a Poisson
distributed variable ($N_x$) that represents the number of samples
in a binomial distribution, the outcome is again Poisson distributed. 
We refer to Geek Box 7.7 for further information. 


Geek Box 7.7: Combining Poisson and Binomial Distribution 


As the individual X-ray photons are independent from each other in
terms of interacting or not, the process can be described as a binomial
distribution. Combined with the Poisson distributed number of generated
photons, we obtain


$$P(N = n_s) = \sum_{N_x = n_s}^{\infty} P(N_x) \cdot P(n_s \mid N_x) = \frac{(N_0\, p_a)^{n_s}}{n_s!}\, e^{-N_0 p_a},$$


where $N_x$ is the number of X-ray photons, $n_s$ is the number of photons
that pass through the object unaffected, $P(N_x)$ is the probability
that the X-ray generation produces $N_x$ photons and $P(n_s \mid N_x)$ is the
conditional probability that models the number of unaffected photons,
given the number of input photons. The sum comes from the “Law of
Total Probability” and is necessary to eliminate the conditional
probability. $P(N = n_s)$ now gives the overall probability that $n_s$ photons
will arrive at the detector after having passed the object. We refer to
[1, p. 65] for a detailed derivation.
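
The claim that a binomial selection applied to a Poisson distributed count is again Poisson distributed can be verified numerically. The sketch below evaluates the sum from the law of total probability directly and compares it with a Poisson distribution of mean $N_0 p_a$. The values `N0 = 20` and `p_a = 0.3` as well as the truncation of the infinite sum at `n_max` are assumptions made only for this illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(n: int, mean: float) -> float:
    """P(N = n) for a Poisson distribution with the given mean."""
    return mean**n * exp(-mean) / factorial(n)


def binomial_pmf(n: int, N: int, p: float) -> float:
    """P(n positive outcomes in N independent trials)."""
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)


def thinned_pmf(n_s: int, N0: float, p_a: float, n_max: int = 150) -> float:
    """Law of total probability: sum over all photon counts N_x >= n_s.

    The infinite sum is truncated at n_max, which is safe as long as
    n_max lies far above the mean N0.
    """
    return sum(
        poisson_pmf(N_x, N0) * binomial_pmf(n_s, N_x, p_a)
        for N_x in range(n_s, n_max)
    )


N0, p_a = 20.0, 0.3  # assumed example values
for n_s in (0, 3, 6, 10):
    # Both columns agree up to floating-point precision.
    print(n_s, thinned_pmf(n_s, N0, p_a), poisson_pmf(n_s, N0 * p_a))
```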



Geek Box 7.8: Probabilistic Lambert-Beer Law 


The resulting distribution for the number of photons after matter 
interaction is thus given by 


$$P(N = n_s) = \frac{(N_0\, p_a)^{n_s}}{n_s!}\, e^{-N_0 p_a}, \qquad (7.11)$$


where $n_s$ is the number of photons that pass through the matter
unaffected. We can also determine the expectation value of this Poisson
distribution, i.e.,


$$E[n_s] = N_0\, p_a = N_0\, e^{-\int \mu(x)\,\mathrm{d}x}. \qquad (7.12)$$


Hence, the expectation value as given by Eq. (7.12) is again Lambert-
Beer's law, which was introduced in Sec. 7.3.

Note that each process that now follows in the X-ray detection step 
can also be modeled or at least approximated by a binomial distribu- 
tion. This holds for the conversion of X-ray photons to light photons 
in scintillator based detectors, for the subsequent conversion of light 
photons to electrons but also for the conversion of X-rays to electrical
charges in direct conversion flat panel detectors. Each of these steps 
yields another Poisson distribution for the number of outgoing pho- 
tons or electrons, hence also the final value at the end of the detection 
step follows a Poisson distribution. This observation is crucial, as it 
tells us that all of the measurements that we get from our detection 
system follow Poisson distributions including its behavior regarding 
noise which will be discussed next. 


7.5.2 Fluoroscopy 


Conventional radiography typically refers to the acquisition of a single or 
small number of X-ray projection images for a specified view. In contrast, 
fluoroscopy describes a sequence of radiographic images acquired periodi- 
cally at a certain frame rate. The X-ray source can either be triggered for 
each frame or simply provide a constant radiation exposure to the region of 
interest. Potential X-ray detectors can be image intensifiers (see Sec. 7.4.1) 
or the newer FPDs (see Sec. 7.4.2). The frame rate is typically limited by the 
acquisition speed of the detection system. For image intensifiers, it is given by 
the inertia of the final fluorescent screen, whereas for FPDs it is determined 
by the speed of the electronic detector readout step. In practice, frame rates 
of 30 frames per second are possible. However, rates are often reduced for 
dose reasons. 


Geek Box 7.9: Signal to Noise Ratio for X-rays


Let us focus on the distribution for the number of photons that arrive
at the detector, i.e., $n_s$. The distribution is given by Eq. (7.11).
From Poisson statistics we know that the expectation value is the only
parameter of the mass distribution function, which was already computed
in Eq. (7.12), hence, $\bar{n} = N_0\, p_a$. Further, it can be shown that
the variance ($\sigma^2$) of a Poisson distribution is equal to the expectation
value, i.e., $\sigma^2 = N_0\, p_a$. The SNR after photon interaction with the
object can then be computed by


$$\mathrm{SNR}(n_s) = \frac{\bar{n}}{\sigma} = \frac{N_0\, p_a}{\sqrt{N_0\, p_a}} = \sqrt{N_0\, p_a} = \sqrt{N_0\, e^{-\int \mu(x)\,\mathrm{d}x}}. \qquad (7.13)$$


The above equation shows us that the SNR of an X-ray imaging system
of course depends on the object, represented by $e^{-\int \mu(x)\,\mathrm{d}x}$, and also
on the number of generated photons at the source $N_0$. The SNR is
proportional to the square root of the number of emitted X-ray pho- 
tons, hence increasing the number of photons also increases the SNR. 
However, a higher number of photons also means a higher dose level 
for the patient which is often the limiting factor and gives an upper 
bound for the SNR. 
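
The square-root dependence of the SNR on the photon count can be illustrated with a few lines of code. This is a sketch under stated assumptions: the SNR is taken as expectation value divided by standard deviation of the Poisson distributed photon count, the object is homogeneous so that Beer's law reduces to `exp(-mu * thickness)`, and the numerical values (`mu = 0.2` per cm, 10 cm thickness) are chosen only for illustration.

```python
import math


def expected_snr(n0: float, mu: float, thickness: float) -> float:
    """SNR = sqrt(N_0 * p_a) for a homogeneous object with attenuation mu."""
    p_a = math.exp(-mu * thickness)  # Beer's law for constant mu
    return math.sqrt(n0 * p_a)


# Assumed values: mu = 0.2 per cm, 10 cm object thickness.
snr_low = expected_snr(1e4, 0.2, 10.0)
snr_high = expected_snr(4e4, 0.2, 10.0)

# Quadrupling the number of emitted photons (and hence the dose)
# only doubles the SNR.
print(snr_low, snr_high, snr_high / snr_low)
```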


Fluoroscopy is of special importance in minimally invasive interventions, 
where catheters, endoscopes, and other tools need to be guided and operated 
without direct visual contact to the region where the actual intervention takes 
place. It is also the key technology for visualizing vessels such as arteries or 
veins by the use of contrast agent, as described in the following section. In 
Fig. 7.20(a), an example image from a fluoroscopy sequence is depicted that 
shows the placement of the two electrodes and wires of a heart pacemaker. A 
typical clinical setup for a minimally invasive surgery is shown in Fig. 7.20(b), 
where the X-ray imaging unit is given by a C-arm scanner that can be freely 
positioned around the patient. 


Figure 7.19: Arm showing fractures in radiographic images, the corresponding
color image after surgery and the radiographic images after fixation of
the bones using metal rods. Image is in public domain and was taken from 
[11]. 


(a) Pacemaker fluoroscopy. Image taken from [5].
(b) Clinical setup. Image courtesy of Siemens Healthineers AG.


Figure 7.20: Left: Image from a fluoroscopy sequence showing the placement 
of two electrodes of a pacemaker. Right: Typical clinical setup of a minimally 
invasive surgery. The fluoroscopy is acquired using a freely positionable C- 
arm device. The image shows the X-ray source at the bottom and the FPD 
is right above the patient. 


7.5.3 Digital Subtraction Angiography 


Angiography refers to the imaging of arteries (venography for veins) to
analyze properties such as shape, size, lumen, or flow rate. Usually, the
attenuation properties of vessels do not substantially differ from those of the
surrounding tissue, which makes X-ray-based imaging hard and yields poor
contrast.

To increase image quality and contrast, a contrast agent is often injected
into the blood circulation. A contrast agent is a liquid that provides an
increased attenuation coefficient compared to normal soft tissue. Typical
contrast media are iodine and barium, where the former is used for
intravascular and the latter for gastrointestinal examinations. Thus, iodine
is injected into the blood circulation, whereas barium can be swallowed to
investigate, e.g., the stomach or colon.


Figure 7.21: Process of creating a DSA. In (a) the hand was imaged and
no contrast agent has been injected (mask image). In (b) the same hand has
been imaged but including injected contrast agent. The difference of (b) and
(a) represents the angiogram as shown in (c). Images provided by
Adam Galant, Siemens Healthineers AG.

In digital subtraction angiography (DSA), a fluoroscopic sequence of a fixed
anatomy is acquired. At the same time, contrast agent is injected in regular
intervals into the vessel system. X-ray images acquired without contrast
agent are assumed to show the background tissue that is typically not of
interest. If we now subtract the initially acquired background image from an
X-ray image with contrast and assume that no patient motion has taken place,
we can measure the attenuation caused only by the injected contrast agent. As
the contrast agent is confined to the vessel system into which it has been
injected, the outcome of such a subtraction will be a visualization of the
vessels only. In Fig. 7.21, an example for a DSA acquisition is presented.
First, the contrast agent free image, i.e., the mask image, is acquired as
shown in Fig. 7.21(a). Then contrast agent is injected into the vascular
system and, after some waiting, a further image, i.e., the fill image, is
acquired. The difference of both images is then the so-called angiogram that
shows only the contributions given by the contrast agent and thus the vessels
are visualized (cf. Fig. 7.21(c)).
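
The subtraction step can be sketched on synthetic data. Note the assumptions: both images are taken to be normalized detector intensities $I / I_0$, and the subtraction is carried out after a negative logarithm so that the result corresponds to line integrals of the contrast agent alone. The text itself only speaks of a difference image; the log-domain formulation, the function name `dsa`, and all numbers below are illustrative choices.

```python
import numpy as np


def dsa(mask: np.ndarray, fill: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Subtract the mask image from the fill image in the log domain."""
    mask_li = -np.log(np.clip(mask, eps, None))  # line integrals, background
    fill_li = -np.log(np.clip(fill, eps, None))  # background plus contrast
    return fill_li - mask_li


# Synthetic example: random background tissue and one vertical "vessel".
rng = np.random.default_rng(0)
background = rng.uniform(0.5, 2.0, size=(64, 64))  # tissue line integrals
vessel = np.zeros((64, 64))
vessel[:, 30:34] = 0.8                             # contrast agent only

mask = np.exp(-background)             # image without contrast agent
fill = np.exp(-(background + vessel))  # image with contrast agent
angiogram = dsa(mask, fill)

print(np.allclose(angiogram, vessel))  # True
```

In this idealized setting without noise and patient motion, the background cancels exactly and only the vessel remains.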


8.1 Introduction


CT is undoubtedly one of the most important technologies in medical imaging
and offers us views inside the human body that are as valuable to physicians 
as they are fascinating (cf. Fig. 8.1). 


8.1.1 Motivation 


In the previous chapter, we have seen how X-rays can be used to acquire 
2-D projection images. However, a single projection image does not retain all 
spatial information, as it merely shows something akin to “shadows” of the 
imaged objects. An example is given in Fig. 8.3(a), which shows an X-ray 
projection image of a luggage bag. Two arrows indicate objects that cannot 
easily be identified. Using multiple projection images from different angles, we
are able to perform a 3-D reconstruction and obtain cross-sectional views of
the objects. Looking at the reconstructed volume, we can read the letters on
the bag (Fig. 8.3(b)) and recognize the bright object as an apple (Fig. 8.3(c)).


Figure 8.1: Volume rendering of a CT head scan. Image courtesy of
Siemens Healthineers AG.


Figure 8.2: The first clinical CT scan, acquired October 1971 at
Atkinson Morley’s Hospital in London.


Figure 8.3: 2-D X-ray projection image of a luggage bag (a) and a
corresponding 3-D reconstruction, visualized with a volume rendering
technique (b) and as orthogonal cross-sectional slices (c). 1) indicates a
hidden text revealed in (b) and 2) an apple that is virtually sliced in (c).
Images courtesy of Chris Schwemmer.


8.1.2 Brief History 


In 1917, Johann Radon published an article about “the determination of 
functions by their integrals along certain manifolds,” which would not find a 
practical application for the following 50 years. The main concepts introduced
in his article will be outlined in Sec. 8.2 and used in Sec. 8.3 to explain how 
CT image reconstruction works. 

Only in 1971, the first CT system was built by Sir Godfrey Newbold
Hounsfield and Allan McLeod Cormack¹. Fig. 8.2 shows the result of the
first clinical scan of a patient's head performed in the same year. For their
seminal invention, they received the Nobel prize in medicine in 1979.

A major advance in the field was the introduction of spiral (or, more 
accurately, helical) CT by Willi Kalender et al. in 1990. Its name is derived 
from the novel aspect of its acquisition trajectory describing a helix. Amongst 
others, this geometry and its advantages will be described in Sec. 8.3.3. 

In the early days of CT imaging, data acquisition was fairly slow, taking
approximately 4 minutes per rotation. The reconstruction of a single 2-D
slice with a low spatial resolution of 80 × 80 pixels and 3 bit quantization
took several hours. By 2002, rotation speed had improved drastically with
one rotation performed in only 0.4 seconds. Up to 16 slices in parallel could
be reconstructed on-the-fly, at a higher resolution of 512 × 512 pixels and a
quantization depth of 16 bit.

In recent years, this trend continued with temporal and spatial resolutions 
constantly improving. In this context, the development of dual source CT 
in 2005 was another significant milestone, featuring two X-ray sources and 
detectors in a single scanner. In addition to offering additional information 
when both employed X-ray tubes are operated at different voltages (dual 
energy scan), it can also be used to speed up the acquisition significantly. 
The number of slices acquired in parallel has also increased further, covering
a field of view measuring up to 16 cm in axial direction at voxel sizes below 
one millimeter. This allows imaging of complete organs such as the heart in 
a single rotation, thus reducing motion artifacts. 

In 2014, a modern CT system (cf. Fig. 8.4) was able to acquire up to 128 
slices in parallel at a temporal resolution as low as 195 ms with a single X-ray 
source. 


8.2 Mathematical Principles 


In this section, we will first introduce the Radon transform as the underly- 
ing mathematical principle of the image formation process in CT imaging. 
Inverting this transform is the fundamental problem solved by image
reconstruction methods. Subsequently, we will detail the Fourier slice
theorem, a property related to the Radon transform that constitutes the core
idea of an important class of reconstruction algorithms.


¹ Hounsfield was working for EMI at the time, a British music recording and
publishing company well known for housing the Beatles’ label. This has spawned a
widespread belief that the Beatles' success contributed to financing the initial
development of CT.


Figure 8.4: At the time of writing, a modern CT scanner can acquire up to
128 image slices in parallel. Image courtesy of Siemens Healthineers AG.


8.2.1 Radon Transform 


Radon’s key insight was that any integrable function $f(x,y)$ can be uniquely
represented by, and therefore recovered from, all straight line integrals over
its domain,


$$p(\ell) = \int_{-\infty}^{+\infty} f\big(x(l), y(l)\big)\,\mathrm{d}l, \quad \forall \ell : \big(x(l), y(l)\big)^\top \in \text{line } \ell. \qquad (8.1)$$


In order to write down all of these integrals without duplicates, a representa- 
tion of the lines is needed that describes each one uniquely. For this purpose, 
we can formulate Eq. (8.1) in terms of polar coordinates, 


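$$p(\theta, s) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} f(x, y)\,\delta(x\cos\theta + y\sin\theta - s)\,\mathrm{d}x\,\mathrm{d}y, \qquad (8.2)$$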


with $\theta$ the angle between the line's normal vector and the x-axis and $s$ the
orthogonal distance between line and origin (cf. Fig. 8.5). Implicitly, this line
is described by the equation $x\cos\theta + y\sin\theta = s$. Only those points that
satisfy it, i.e., those that fall on the line, are selected by the Dirac function $\delta$
in Eq. (8.2), as it vanishes everywhere else (cf. Eq. (2.10)). In that way, the
integration of $f(x,y)$ is only performed along the respective line.

The complete set of line integrals $p(\theta, s)$ can now be obtained by going
through the angles $\theta \in [0^\circ, 180^\circ]$ and distances $s \in (-\infty, +\infty)$. Apart from
orientation, which has no influence on the integration, any other line would be
equivalent to one of these. For a fixed angle $\theta$, the 1-D function $p_\theta(s) = p(\theta, s)$
is called a projection. It contains all line integrals over $f(x,y)$ with a constant
angle $\theta$ and variable distance $s$ to the origin. Arranging all projections side-
by-side as a 2-D image yields the sinogram. It owes its name to the sinusoidal
curves emerging from the underlying geometry. We can see that every point
in 2-D except for the origin is found at different distances along the s-axis
depending on the angle $\theta$. An example of a sinogram is given in Fig. 8.6.


Figure 8.5: The blue line is uniquely described by its distance $s$ to the
origin and the angle $\theta$ which defines its normal vector $(\cos\theta, \sin\theta)^\top$. This
representation immediately gives rise to the implicit line equation
$x\cos\theta + y\sin\theta = s$.


Figure 8.6: $f(x,y)$ has a constant non-zero value inside the blue circle and
vanishes everywhere else. We see a single projection on the left and where it
fits into the whole sinogram on the right.


Turning the function values $f(x, y)$ into line integral values $p(\theta, s)$ is known
as the Radon transform in 2-D. The aim of CT reconstruction is the
computation of the original function values from measured line integral values,
i.e., the inverse Radon transform.
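
To see where the sinusoidal curves in a sinogram come from, consider a small disk as in Fig. 8.6. Its line integrals are known in closed form: a line at distance $d$ from the disk center crosses a chord of length $2\sqrt{R^2 - d^2}$. The sketch below builds the sinogram analytically. The function name, the disk position, and the sampling are assumptions made only for this example.

```python
import numpy as np


def disk_sinogram(cx, cy, radius, thetas, s):
    """Analytic parallel-beam sinogram of a disk with unit density.

    Returns an array of shape (len(thetas), len(s)); each row is one
    projection p_theta(s).
    """
    # The disk centre projects to s0 = cx * cos(theta) + cy * sin(theta).
    s0 = cx * np.cos(thetas)[:, None] + cy * np.sin(thetas)[:, None]
    half_chord_sq = radius**2 - (s[None, :] - s0) ** 2
    # Chord length through the disk; zero where the line misses it.
    return 2.0 * np.sqrt(np.clip(half_chord_sq, 0.0, None))


thetas = np.deg2rad(np.arange(0, 180))  # 180 projection angles
s = np.linspace(-1.0, 1.0, 201)         # detector coordinate
sino = disk_sinogram(0.4, 0.2, 0.1, thetas, s)

# The location of the maximum in each projection traces a sinusoid in theta.
peak_s = s[np.argmax(sino, axis=1)]
```

Plotting `sino` as an image, or `peak_s` over `thetas`, shows the trace $s = x\cos\theta + y\sin\theta$ of the disk center.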


Figure 8.7: The Fourier slice theorem establishes an equivalence between
the Fourier transform $P(\xi,\theta)$ of the projection $p_\theta(s)$ and a line in the Fourier
transform $F(u, v)$ of $f(x, y)$ which runs through the origin and forms the angle
$\theta$ with the u-axis. Please note that in the frequency domain images on the
right, the magnitudes of the complex numbers were plotted on a logarithmic
scale for improved readability.


8.2.2 Fourier Slice Theorem 


While it is not immediately clear how to invert the process of projection, we
can take a detour through the frequency domain. Fig. 8.7 depicts the principle
behind the Fourier slice theorem by establishing relationships between the
relevant domains. We start by computing the 1-D Fourier transform $P(\xi,\theta)$
of the projection $p_\theta(s)$. The Fourier slice theorem establishes an equivalence
between $P(\xi,\theta)$ and a line in the Fourier transform $F(u, v)$ of $f(x, y)$ which
runs through the origin and forms the angle $\theta$ with the u-axis. A proof of this
property can be found in Geek Box 8.1. An intuitive visualization of this
relation is displayed in Fig. 8.8. Computation of one 2-D Fourier coefficient
is equivalent to projecting the image first, followed by a correlation with the
respective 1-D frequency. This is possible as Fourier transform and projection
operate in orthogonal directions and are therefore separable as shown in
Fig. 8.9. With the complete set of projections, we get many such lines and
therefore obtain a good estimate of $F(u, v)$. An inverse 2-D Fourier transform
then leads us back to the desired function $f(x, y)$.
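
For the special case $\theta = 0$, the Fourier slice theorem can be checked directly with discrete Fourier transforms. The sketch below is only an illustration on a random test image; for other angles, a discrete check would additionally require interpolation, which we leave out here.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((64, 64))    # test image, indexed as f[y, x]

projection = f.sum(axis=0)  # integrate along y: projection for theta = 0
P = np.fft.fft(projection)  # 1-D Fourier transform of the projection
F = np.fft.fft2(f)          # 2-D Fourier transform, indexed as F[v, u]

# The line through the origin along the u-axis (v = 0) equals P.
print(np.allclose(P, F[0, :]))  # True
```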
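

Geek Box 8.1: Fourier Slice Theorem


An essential relationship between the projections $p_\theta(s)$ and the
function $f(x,y)$ can be established by looking at the frequency domain
representations


$$F(u,v) = \mathcal{F}\{f(x,y)\} \qquad (8.3)$$
$$P(\xi,\theta) = \mathcal{F}\{p_\theta(s)\}, \qquad (8.4)$$

using the Fourier transform $\mathcal{F}$ (cf. Sec. 2.3 (p. 22)). As illustrated in
Fig. 8.7, $P(\xi,\theta)$ is equivalent to the part of $F(u,v)$ that falls on a
radial line with angle $\theta$. To see why this is the case, we start with the
1-D Fourier transform $P(\xi,\theta)$ of $p_\theta(s)$,


$$P(\xi,\theta) = \int_{-\infty}^{+\infty} p_\theta(s)\, e^{-2\pi i \xi s}\,\mathrm{d}s. \qquad (8.5)$$


Using the definition of the projection $p_\theta(s)$ from the previous section,
we obtain


$$P(\xi,\theta) = \int_{-\infty}^{+\infty} \iint_{-\infty}^{+\infty} f(x, y)\,\delta(x\cos\theta + y\sin\theta - s)\,\mathrm{d}x\,\mathrm{d}y\; e^{-2\pi i \xi s}\,\mathrm{d}s. \qquad (8.6)$$


Rearranging the order of the integrals yields


$$P(\xi,\theta) = \iint_{-\infty}^{+\infty} f(x,y) \int_{-\infty}^{+\infty} \delta(x\cos\theta + y\sin\theta - s)\, e^{-2\pi i \xi s}\,\mathrm{d}s\,\mathrm{d}x\,\mathrm{d}y. \qquad (8.7)$$


Eliminating the delta function reads as


$$P(\xi,\theta) = \iint_{-\infty}^{+\infty} f(x,y)\, e^{-2\pi i (x\cos\theta + y\sin\theta)\xi}\,\mathrm{d}x\,\mathrm{d}y. \qquad (8.8)$$


Variable substitution yields the definition of the 2-D Fourier transform,


$$P(\xi,\theta) = \left[\iint_{-\infty}^{+\infty} f(x,y)\, e^{-2\pi i (xu + yv)}\,\mathrm{d}x\,\mathrm{d}y\right]_{u = \xi\cos\theta,\; v = \xi\sin\theta}, \qquad (8.9)$$

which finally results in the proposed theorem,

$$P(\xi,\theta) = F(\xi\cos\theta, \xi\sin\theta) = F_{\text{polar}}(\xi,\theta). \qquad (8.10)$$


In effect, we can get the complete Fourier transform $F_{\text{polar}}(\xi,\theta)$ of the
unknown function $f(x,y)$ in polar coordinates $(\xi,\theta)$ by varying $\theta$.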


Figure 8.8: Graphical visualization of the Fourier slice theorem. In fact, com- 
putation of the projection and correlation with a sinusoidal function (dashed 
lines) is equivalent to a 2-D correlation with the respective Fourier base func- 
tion (dotted lines, cf. Fig. 6.9). 



Figure 8.9: A close look at the Fourier base functions reveals that they are 
actually computing an integration along the wave front. As such projection 
and convolution operate in orthogonal domains and can therefore be sepa- 
rated into a projection and a 1-D correlation, i.e., a 1-D Fourier transform. 


Material / Tissue    HU

Air                  −1000
Lung                 −600 to −400
Fat                  −100 to −60
Water                0
Muscle               10 to 40
Blood                30 to 45
Soft tissue          40 to 80
Bone                 400 to 3000


Table 8.1: HUs observed in several materials and tissue classes found in the 
human body. In general, denser structures exhibit larger HUs. 


8.3 Image Reconstruction 


As described in Sec. 7.3 (p.125), X-ray projections can be converted to line 
integrals using Beer’s law, which enables us to apply Radon’s ideas to CT 
reconstruction. A single slice of our imaged object corresponds to the bivariate 
function f(x,y). More precisely, the function values reconstructed in CT are 
the linear attenuation coefficients of the imaged material. Typically, they are 
linearly transformed to the Hounsfield scale, which is normalized such that 
the absorption of water equals 0 HU, 


$$\mu^{*} = \left(\frac{\mu}{\mu_{\text{Water}}} - 1\right) \cdot 1000, \qquad (8.11)$$


where $\mu$ and $\mu^{*}$ denote the coefficients before and after Hounsfield scaling,
respectively. Tab. 8.1 lists the HU ranges of several tissue classes found in 
the human body. 
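
As a small illustration of Eq. (8.11), the following sketch converts linear attenuation coefficients to the Hounsfield scale. The value used for the attenuation of water (`MU_WATER = 0.19` per cm) is only a rough assumption for typical CT energies; in practice it depends on the X-ray spectrum of the scanner.

```python
MU_WATER = 0.19  # assumed attenuation of water in 1/cm (energy dependent)


def to_hounsfield(mu: float, mu_water: float = MU_WATER) -> float:
    """Map a linear attenuation coefficient to Hounsfield units."""
    return (mu / mu_water - 1.0) * 1000.0


print(to_hounsfield(MU_WATER))  # water -> 0.0
print(to_hounsfield(0.0))       # no attenuation (approximately air) -> -1000.0
```

Independent of the value chosen for water, the scale maps water to 0 HU and vanishing attenuation to −1000 HU, in agreement with Tab. 8.1.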

Below, we will discuss the two main methods for 2-D image reconstruc- 
tion from parallel-beam projections as they have been introduced above. In 
conventional CT imaging, a 3-D image volume is then obtained simply by 
acquiring and reconstructing multiple axial slices at slightly offset locations 
such that they can be stacked on top of each other (Fig. 8.10). 


8.3.1 Analytic Reconstruction 


Using the Fourier slice theorem, we can derive an analytic reconstruction 
method known as filtered back-projection. 


Figure 8.10: In conventional CT, a 3-D image of the body is formed by
acquiring, reconstructing, and subsequently stacking 2-D image slices in axial 
direction. For each slice, all projection rays lie in a plane, which is why we 
only deal with bivariate functions f(x,y). However, be aware that there are 
other geometries where this assumption is no longer valid (cf. Sec. 8.3.3). 


8.3.1.1 Filtered Back-Projection 


It is possible to invert the process of projection directly, without explicitly
carrying out the computations in frequency space suggested by Fig. 8.7. In
Geek Box 8.2, it is shown that the required calculations reduce to


$$f(x, y) = \int_{0}^{\pi} p_\theta(s) * h(s)\big|_{s = x\cos\theta + y\sin\theta}\,\mathrm{d}\theta, \qquad (8.12)$$


where $h(s)$ corresponds to the inverse Fourier transform of $|\xi|$. This amounts
to the back-projection of $p_\theta(s)$ convolved with $h(s)$. As a consequence, this
method is called filtered back-projection.

Unfiltered back-projection, i.e., just “smearing” line integrals in a
projection $p_\theta(s)$ back along their corresponding lines without filtering (cf. Fig. 8.12),
is equivalent to adding $P(\xi,\theta)$ to $F(u,v)$, as suggested by the Fourier slice
theorem, without considering the factor $|\xi|$. In fact, this is not the inverse,
but the dual or adjoint of the Radon transform.


8.3.1.2 Filters 


Due to the shape of $|\xi|$, the filter $h(s)$ is typically called ramp filter.
Sampling in polar coordinates leads to an oversampling in the center of the Fourier
space (cf. Fig. 8.11). Using a ramp filter, this oversampling is corrected by
enhancing the high frequency components while dampening the low frequencies
in the center of the Fourier space.

According to the sampling theorem, as described in Sec. 2.4.2 (p. 32), with
a detector spacing of $\Delta s$, the largest frequency that can be detected in $p_\theta(s)$ is
$\xi_{\max} = \frac{1}{2\Delta s}$. Additionally, noise in the projections is amplified when using an
unlimited ramp filter $|\xi|$. Therefore, high frequencies should be limited in the
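

Geek Box 8.2: Filtered Back-Projection


To derive the filtered back-projection algorithm, we start with the
inverse Fourier transform of $F(u, v)$,


$$f(x,y) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} F(u, v)\, e^{2\pi i (ux + vy)}\,\mathrm{d}u\,\mathrm{d}v.$$


We can rewrite this equation in polar coordinates $F_{\text{polar}}(\xi,\theta)$ by
substituting $u = \xi\cos\theta$ and $v = \xi\sin\theta$. A change in integration variables
requires an additional “correction” factor in the integral. This factor
is the absolute value of the determinant of the transformation's
Jacobian matrix $J$:


$$|\det J| = \left|\det\begin{pmatrix} \frac{\partial u}{\partial \xi} & \frac{\partial u}{\partial \theta} \\ \frac{\partial v}{\partial \xi} & \frac{\partial v}{\partial \theta} \end{pmatrix}\right| = \left|\det\begin{pmatrix} \cos(\theta) & -\xi\sin(\theta) \\ \sin(\theta) & \xi\cos(\theta) \end{pmatrix}\right| = \left|\xi\cos^2(\theta) + \xi\sin^2(\theta)\right| = |\xi|.$$


Therefore, performing the change in coordinates, we obtain

$$f(x,y) = \int_{0}^{\pi}\!\int_{-\infty}^{+\infty} F_{\text{polar}}(\xi,\theta)\,|\xi|\, e^{2\pi i \xi (x\cos\theta + y\sin\theta)}\,\mathrm{d}\xi\,\mathrm{d}\theta.$$


From the Fourier slice theorem, we know $F_{\text{polar}}(\xi,\theta) = P(\xi,\theta)$, thus


$$f(x,y) = \int_{0}^{\pi}\!\int_{-\infty}^{+\infty} P(\xi,\theta)\,|\xi|\, e^{2\pi i \xi (x\cos\theta + y\sin\theta)}\,\mathrm{d}\xi\,\mathrm{d}\theta.$$


Replacing $x\cos\theta + y\sin\theta$ with $s$, this reads as


$$f(x,y) = \int_{0}^{\pi}\!\int_{-\infty}^{+\infty} P(\xi,\theta)\,|\xi|\, e^{2\pi i \xi s}\,\mathrm{d}\xi\,\mathrm{d}\theta,$$


which contains a product of $P(\xi,\theta)$ and $|\xi|$. A product in Fourier space
corresponds to a convolution in spatial domain (cf. Sec. 2.3.2 (p. 25)).
If we denote the inverse Fourier transform of $|\xi|$ with $h(s)$, we find the
following spatial domain representation:


$$f(x,y) = \int_{0}^{\pi} p_\theta(s) * h(s)\big|_{s = x\cos\theta + y\sin\theta}\,\mathrm{d}\theta. \qquad (8.13)$$


This amounts to the back-projection of $p_\theta(s)$ convolved with the filter
kernel $h(s)$.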

Figure 8.11: Sampling in polar 
coordinates causes the density of 
samples to increase with proximity 
to the origin, whereas the more dis- 
tant areas are under-represented. 


Figure 8.12: The back-projection
$b_\theta(x, y) = p_\theta(s)\big|_{s = x\cos\theta + y\sin\theta}$ of a single
projection $p_\theta(s)$ hardly gives us an idea of the original function $f(x,y)$.
However, we can reconstruct it by back-projecting a sufficient set of
appropriately filtered projections.


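filtered projections $q_\theta(s)$. For this purpose, we can generalize the following
equation


$$q_\theta(s) = \int_{-\infty}^{+\infty} P(\xi,\theta)\,|\xi|\, e^{2\pi i \xi s}\,\mathrm{d}\xi, \qquad (8.14)$$


by replacing $|\xi|$ with an arbitrary filter $H(\xi)$:


$$q_\theta(s) = \int_{-\infty}^{+\infty} P(\xi,\theta)\, H(\xi)\, e^{2\pi i \xi s}\,\mathrm{d}\xi. \qquad (8.15)$$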

In practice, various ramp-like filters are used depending on the desired im- 
age characteristics, typically involving a trade-off between a smoother image 
appearance and a higher spatial resolution. 

One of the most widely known filters was described by Ramachandran and
Lakshminarayanan, in short known as the “Ram-Lak” filter. It corresponds
to $|\xi|$ cut off at $\xi_{\max}$ on both sides. In the spatial domain, this results in a
filter kernel that reads


$$h(s) = \frac{\operatorname{sinc}\left(\frac{s}{\Delta s}\right)}{2(\Delta s)^2} - \frac{\operatorname{sinc}^2\left(\frac{s}{2\Delta s}\right)}{4(\Delta s)^2}, \qquad (8.16)$$


a discretized version of which can be convolved with the discrete projection.
A derivation of the Ram-Lak filter is given in Geek Box 8.3.
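

Geek Box 8.3: Ram-Lak Filter


In order to derive the Ram-Lak filter, we need to start with the inverse
Fourier transform of $|\xi|$,


$$\int_{-\infty}^{\infty} |\xi|\, e^{2\pi i \xi s}\,\mathrm{d}\xi.$$


Now, we introduce a band limitation that only allows frequencies $|\xi| < B$:


$$\int_{-B}^{B} |\xi|\, e^{2\pi i \xi s}\,\mathrm{d}\xi = \int_{-\infty}^{\infty} |\xi|\, e^{2\pi i \xi s}\, \operatorname{rect}\left(\frac{\xi}{2B}\right)\mathrm{d}\xi.$$


Note that we use the rectangular function (cf. Tab. 2.2) to express the
band limitation above. Furthermore, we can also use this function to
express $|\xi|$, as the convolution of two rectangular functions yields a
triangular function (cf. Tab. 2.2):


$$|\xi| = B - \operatorname{rect}\left(\frac{\xi}{B}\right) * \operatorname{rect}\left(\frac{\xi}{B}\right) \quad \text{for } |\xi| \le B.$$


Now the band-limited inverse Fourier transform of $|\xi|$ takes the
following form:


$$\begin{aligned}
h(s) &= \mathcal{F}^{-1}\left[\left(B - \operatorname{rect}\left(\tfrac{\xi}{B}\right) * \operatorname{rect}\left(\tfrac{\xi}{B}\right)\right) \operatorname{rect}\left(\tfrac{\xi}{2B}\right)\right] \\
&= \mathcal{F}^{-1}\left[B \operatorname{rect}\left(\tfrac{\xi}{2B}\right)\right] - \mathcal{F}^{-1}\Big[\underbrace{\left(\operatorname{rect}\left(\tfrac{\xi}{B}\right) * \operatorname{rect}\left(\tfrac{\xi}{B}\right)\right)}_{\text{support on } [-B, B]} \underbrace{\operatorname{rect}\left(\tfrac{\xi}{2B}\right)}_{=1 \text{ on } [-B, B]}\Big] \\
&= \mathcal{F}^{-1}\left[B \operatorname{rect}\left(\tfrac{\xi}{2B}\right)\right] - \mathcal{F}^{-1}\left[\operatorname{rect}\left(\tfrac{\xi}{B}\right)\right] \cdot \mathcal{F}^{-1}\left[\operatorname{rect}\left(\tfrac{\xi}{B}\right)\right] \\
&= 2B^2 \operatorname{sinc}(2Bs) - B^2 \operatorname{sinc}^2(Bs).
\end{aligned}$$


With $B = \xi_{\max} = \frac{1}{2\Delta s}$, we arrive exactly at Eq. (8.16). In the discrete
case, $s$ needs to be an integer number. Thus, we can simplify the above
equation even further to:


$$h_s = \begin{cases} \dfrac{1}{(2\Delta s)^2} & s = 0 \\[2mm] 0 & s \text{ even} \\[2mm] -\dfrac{1}{(\pi s \Delta s)^2} & s \text{ odd.} \end{cases}$$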
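
The discrete kernel can be generated in a few lines and compared against the continuous expression $2B^2\operatorname{sinc}(2Bs) - B^2\operatorname{sinc}^2(Bs)$ sampled at $s = n\,\Delta s$. This is a sketch for illustration; the function name `ram_lak_kernel` and its parameters are our own. Note that `np.sinc` is the normalized sinc function $\sin(\pi x)/(\pi x)$, which is the convention assumed here.

```python
import numpy as np


def ram_lak_kernel(half_width: int, ds: float = 1.0) -> np.ndarray:
    """Discrete Ram-Lak kernel h_n for n = -half_width, ..., half_width."""
    n = np.arange(-half_width, half_width + 1)
    h = np.zeros(n.shape, dtype=float)
    h[n == 0] = 1.0 / (4.0 * ds**2)              # central sample
    odd = (n % 2) != 0                           # even samples stay zero
    h[odd] = -1.0 / (np.pi * n[odd] * ds) ** 2   # odd samples
    return h


ds = 0.5
h = ram_lak_kernel(5, ds)

# Cross-check against the continuous kernel with B = 1 / (2 * ds).
B = 1.0 / (2.0 * ds)
s = np.arange(-5, 6) * ds
h_cont = 2 * B**2 * np.sinc(2 * B * s) - B**2 * np.sinc(B * s) ** 2
print(np.allclose(h, h_cont))  # True
```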

Figure 8.13: The responses in frequency domain (top row) as well as the 
discretized spatial domain kernels (bottom row) of the Ram-Lak and Shepp- 
Logan filters. 


To suppress noise, a windowing function can be multiplied with the filter
in frequency domain which lowers its response for frequencies close to $\xi_{\max}$.
In the case of the commonly used filter proposed by Shepp and Logan, this
windowing function is

$$\left|\operatorname{sinc}\left(\frac{\xi}{2\xi_{\max}}\right)\right|. \qquad (8.17)$$

This leads to a slightly different function h(s) which can be discretized in the 
same manner. Fig. 8.13 shows plots of the Ram-Lak and Shepp-Logan filters. 
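
The frequency responses of both filters can be evaluated as follows. This is a minimal sketch, assuming the normalized sinc function as implemented by `np.sinc`; the function name `ramp_filters` is our own.

```python
import numpy as np


def ramp_filters(xi: np.ndarray, xi_max: float):
    """Frequency responses H(xi) of the Ram-Lak and Shepp-Logan filters."""
    # Ram-Lak: ramp |xi|, cut off at xi_max on both sides.
    ram_lak = np.where(np.abs(xi) <= xi_max, np.abs(xi), 0.0)
    # Shepp-Logan: the same ramp multiplied with a sinc window.
    shepp_logan = ram_lak * np.abs(np.sinc(xi / (2.0 * xi_max)))
    return ram_lak, shepp_logan


xi = np.array([0.0, 0.25, 0.5, 0.75])
rl, sl = ramp_filters(xi, xi_max=0.5)
print(rl)  # ramp up to xi_max, zero beyond
print(sl)  # damped towards xi_max: factor 2/pi at xi = xi_max
```

At $\xi = \xi_{\max}$ the window reduces the response to $2/\pi \approx 64\,\%$ of the Ram-Lak value, which explains the smoother image appearance.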


8.3.1.3 Discretization 


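Through discretization, the convolution

$$q_\theta(s) = \int_{-\infty}^{+\infty} p_\theta(s') \cdot h(s - s')\,\mathrm{d}s' \qquad (8.18)$$


becomes

$$q_{\theta,s} = \sum_{s'} p_{\theta,s'}\, h_{s-s'}\, \Delta s. \qquad (8.19)$$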


Figure 8.14: Example for an image grid and some projection rays.


An example for a discrete filter $h_s$ is given in Geek Box 8.3.
With an angle increment of $\mathrm{d}\theta = \frac{\pi}{N}$, where $N$ is the number of acquired
projections, the final back-projection step in Eq. (8.13) can be written as


$$f(x, y) = \frac{\pi}{N} \sum_{i=1}^{N} q_{\theta_i}(s)\big|_{s = x\cos\theta_i + y\sin\theta_i}, \qquad (8.20)$$


where $\theta_i$ denotes the $i$-th angle. Note that the value $q_{\theta_i}(s)$ will generally have
to be interpolated from the $q_{\theta_i,s}$, since $s$ is not necessarily an integer number.
For each position $(x, y)$, we can find $f(x, y)$ by summing over corresponding
(interpolated) values of each filtered projection $q_{\theta_i}$. As a rule of thumb, it is
recommended to avoid interpolation in the output space, i.e., in our case, we
should sample $f$ directly. This comes naturally with the formulation given in
Eq. (8.20). In contrast, back-projecting one $q_{\theta_i}$ at a time to the whole volume
would require interpolation on grid points in the domain of $f$ in each step.
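
The discrete filtering and back-projection steps can be combined into a compact parallel-beam reconstruction. The following is a minimal sketch, not a reference implementation: it assumes a discrete Ram-Lak kernel with samples $1/(2\Delta s)^2$ at the center, $0$ at even and $-1/(\pi n \Delta s)^2$ at odd offsets, linear interpolation via `np.interp`, and an analytically computed sinogram of a centered disk (radius 20, unit density) as test data. All function names and sizes are our own choices.

```python
import numpy as np


def ram_lak_kernel(half_width, ds):
    """Discrete Ram-Lak kernel for offsets -half_width, ..., half_width."""
    n = np.arange(-half_width, half_width + 1)
    h = np.zeros(n.shape, dtype=float)
    h[n == 0] = 1.0 / (4.0 * ds**2)
    odd = (n % 2) != 0
    h[odd] = -1.0 / (np.pi * n[odd] * ds) ** 2
    return h


def filtered_back_projection(sinogram, thetas, s, grid):
    """Reconstruct f on grid x grid; sinogram[i] is the projection at thetas[i]."""
    n_det = s.size
    ds = s[1] - s[0]
    h = ram_lak_kernel(n_det - 1, ds)
    x, y = np.meshgrid(grid, grid)  # x varies along columns, y along rows
    recon = np.zeros(x.shape, dtype=float)
    for p, theta in zip(sinogram, thetas):
        # Discrete convolution; the slice keeps the samples aligned with s.
        q = ds * np.convolve(p, h, mode="full")[n_det - 1 : 2 * n_det - 1]
        # Sample the filtered projection at s = x cos(theta) + y sin(theta),
        # i.e., interpolate in the projection, not in the output image.
        s_xy = x * np.cos(theta) + y * np.sin(theta)
        recon += np.interp(s_xy, s, q, left=0.0, right=0.0)
    return recon * np.pi / len(thetas)


# Test data: analytic sinogram of a centered disk (radius 20, density 1).
thetas = np.deg2rad(np.arange(180))
s = np.arange(-64.0, 65.0)
projection = 2.0 * np.sqrt(np.clip(20.0**2 - s**2, 0.0, None))
sinogram = np.tile(projection, (thetas.size, 1))

grid = np.arange(-32.0, 33.0)
recon = filtered_back_projection(sinogram, thetas, s, grid)
print(recon[32, 32])  # center of the disk, close to 1
print(recon[32, 62])  # point outside the disk, close to 0
```

Because the loop evaluates every filtered projection at the pixel positions, interpolation only happens along the detector coordinate $s$, in line with the rule of thumb above.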


8.3.2 Algebraic Reconstruction 


A second approach to CT image reconstruction defines the problem as a sys- 
tem of linear equations. Each projection ray corresponds to a linear equation 
that sums up the image pixels the ray passes through, i.e., computes its dis- 
crete line integral, and demands it to equal the measured line integral value. 
Fig. 8.14 shows an exemplary image grid and a set of projection rays.
Accordingly, we can define the image reconstruction problem as


$$A x = p, \qquad (8.21)$$
where


$$x = (x_1, x_2, \ldots, x_N)^\top \qquad (8.22)$$


$$p = (p_1, p_2, \ldots, p_M)^\top \qquad (8.23)$$


are the sequentially numbered unknown pixels and measured line integrals, 
respectively. Each element a;; of the system matrix A describes the contri- 
bution of a particular pixel to a particular ray. There are many possibilities 
to model A. In the simplest case, a;; is a binary value that is 1 when the ray 
passed through the pixel and 0 otherwise. The length of intersection can also 
be used, or even the area of intersection in case we assume the rays to have 
a non-zero thickness. 
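
As a small illustration of the binary model, the following Python sketch builds the system matrix for an $n \times n$ image that is probed only by axis-aligned rays, one per image row and one per image column. This ray set and the function name `binary_system_matrix` are our own simplifying assumptions; a real scanner geometry would require ray tracing through the grid.

```python
import numpy as np

def binary_system_matrix(n):
    """Binary system matrix for an n x n image probed by axis-aligned rays.

    Rays 0..n-1 run along the image rows, rays n..2n-1 along the columns.
    Pixels are numbered row by row, i.e., pixel (r, c) has index j = r * n + c.
    """
    A = np.zeros((2 * n, n * n))
    for r in range(n):
        for c in range(n):
            j = r * n + c
            A[r, j] = 1.0        # horizontal ray r passes through pixel (r, c)
            A[n + c, j] = 1.0    # vertical ray c passes through pixel (r, c)
    return A

image = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
A = binary_system_matrix(2)
x = image.ravel()   # sequentially numbered pixels
p = A @ x           # discrete line integrals: row sums followed by column sums
```

Each row of `A` encodes one ray, and the product `A @ x` yields the corresponding discrete line integrals.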

Solving this system of linear equations for the solution $\mathbf{x}$ directly using 
matrix inversion (Gaussian elimination, singular value decomposition, etc.) 
is not feasible in practice as the problems are typically large, ill-conditioned, 
and over-determined. Instead, an iterative solution to this system of linear 
equations is sought.² 

The algebraic reconstruction technique (ART) aims to find such an iterative 
solution using the Kaczmarz method. The basic idea behind this method 
is that each linear equation defines a line (2-D) or, generally speaking, a 
hyperplane (higher dimensions) in the solution space, the dimensionality of 
which equals the number of unknowns. All points on a hyperplane fulfill its 
corresponding equation. Consequently, the point of intersection of all 
hyperplanes forms the correct solution to the problem. Thus, by repeatedly 
projecting the current estimate orthogonally onto a different equation’s hyperplane, we 
iteratively improve the solution (Fig. 8.15). A simple mathematical intuition 
for the ART algorithm is given in Geek Box 8.4. 

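Using Kaczmarz’ method, we can now find an iterative solution for 
Eq. (8.21). For each line integral measurement $p_i$ and each row $\mathbf{a}_i$ of the 
system matrix $\mathbf{A}$ we perform the following update step, 

$$\mathbf{x}^{k+1} = \mathbf{x}^{k} + \frac{p_i - \mathbf{a}_i \mathbf{x}^{k}}{\mathbf{a}_i \mathbf{a}_i^\top} \, \mathbf{a}_i^\top, \qquad (8.29)$$

and repeat until convergence. 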
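
The update step of Eq. (8.29) translates almost literally into code. The following Python sketch cycles through the rows of a small dense matrix; the function name `kaczmarz`, the fixed number of sweeps, and the 2-D toy system are our own illustrative choices. In practice, the rows of $\mathbf{A}$ would be generated on the fly and a proper stopping criterion would be used.

```python
import numpy as np

def kaczmarz(A, p, n_sweeps=50, x0=None):
    """Kaczmarz iterations: project the estimate onto one hyperplane after another."""
    A = np.asarray(A, dtype=float)
    p = np.asarray(p, dtype=float)
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    for _ in range(n_sweeps):
        for a_i, p_i in zip(A, p):
            norm_sq = a_i @ a_i
            if norm_sq == 0.0:
                continue  # a ray that hits no pixel carries no information
            # Orthogonal projection of x onto the hyperplane a_i x = p_i.
            x += (p_i - a_i @ x) / norm_sq * a_i
    return x

# Toy example with two unknowns, i.e., two lines in 2-D.
A = np.array([[1.0, 1.0],
              [1.0, -2.0]])
x_true = np.array([2.0, 1.0])
p = A @ x_true
x_rec = kaczmarz(A, p)
```

For this consistent toy system, the estimate approaches the intersection point of the two lines within a few sweeps.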

It has been shown by Tanabe in 1971 that if a unique solution exists, this 
iterative scheme converges to the solution. However, in over-determined sys- 
tems and in presence of noise, no unique solution might be found and the 
method might oscillate around the ideal solution. The rate of convergence 
depends on the angle between the lines. If two lines are orthogonal to each 
other, the method converges very quickly as the orthogonal projection imme- 
diately finds the intersection. Thus, orthogonalization methods can be applied 


2 It is worthwhile to note that while $\mathbf{A}^{-1}\mathbf{p}$ corresponds to the ideal solution, which in 
the previous section we would obtain by filtered back-projection, $\mathbf{A}^\top\mathbf{p}$ amounts to an 
unfiltered back-projection. 

Geek Box 8.4: 2-D Algebraic Reconstruction Technique Example 

In the 2-D case, consider a point $\mathbf{x}$ and a line $\mathbf{n}^\top \mathbf{c} = d$, where $\mathbf{n}$ is the 
normal vector, $\mathbf{c}$ a point on the line and $d$ the distance to the origin. 
Note that $\mathbf{n}^\top \mathbf{c}$ denotes the scalar product of the two vectors. The orthogonal 
projection $\mathbf{x}'$ of $\mathbf{x}$ onto this line must be in the direction of the normal 
vector $\mathbf{n}$: 

$$\mathbf{x}' = \mathbf{x} + \lambda \mathbf{n} \qquad (8.24)$$

The projected point $\mathbf{x}'$ is part of the line and therefore fulfills 

$$\mathbf{n}^\top \mathbf{x}' = d. \qquad (8.25)$$

Plugging Eq. (8.24) into this equation we get 

$$\mathbf{n}^\top (\mathbf{x} + \lambda \mathbf{n}) = d, \qquad (8.26)$$

which can be rewritten as 

$$\lambda = \frac{d - \mathbf{n}^\top \mathbf{x}}{\mathbf{n}^\top \mathbf{n}}. \qquad (8.27)$$

Substituting $\lambda$ in Eq. (8.24), we arrive at 

$$\mathbf{x}' = \mathbf{x} + \frac{d - \mathbf{n}^\top \mathbf{x}}{\mathbf{n}^\top \mathbf{n}} \, \mathbf{n}. \qquad (8.28)$$

Figure 8.15: Kaczmarz iterations in 2-D space (the two lines are labeled $\mathbf{a}_1 \mathbf{x} = p_1$ and $\mathbf{a}_2 \mathbf{x} = p_2$). 

in advance in order to improve convergence. However, using such methods is 
computationally demanding and amplifies noise in the measurements. 

Several extensions of this algorithm aim to improve convergence speed. 
Instead of orthogonalization, for instance, one can also use ordered subsets 
of the equations to select a better order of the performed projections. An- 
other extension, Simultaneous ART (SART), achieves a speed-up by doing 
multiple updates at the same time and then combining the results. In each 
step, the current estimate is orthogonally projected on all lines. The centroid 
of all projected points is then used for the next iteration. This results in the 
following update rule: 


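$$\mathbf{x}^{k+1} = \mathbf{x}^{k} + \lambda_k \sum_i u_{k,i} \, \frac{p_i - \mathbf{a}_i \mathbf{x}^{k}}{\mathbf{a}_i \mathbf{a}_i^\top} \, \mathbf{a}_i^\top, \qquad (8.30)$$

with 

$$\sum_i u_{k,i} = 1, \qquad (8.31)$$

where $\lambda_k$ controls the step size in each iteration. 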
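
The simultaneous update can be sketched in a few lines of Python. The sketch below assumes uniform weights that sum to one, so that each iteration moves to the centroid of all orthogonal projections, and a constant step size. Both choices, as well as the function name `simultaneous_update`, are our own assumptions for illustration; weighting schemes used in practical SART implementations may differ.

```python
import numpy as np

def simultaneous_update(A, p, n_iter=200, step=1.0):
    """Project the estimate onto all hyperplanes at once and move to the weighted mean."""
    A = np.asarray(A, dtype=float)
    p = np.asarray(p, dtype=float)
    row_norm_sq = np.sum(A**2, axis=1)      # assumed non-zero for every ray
    u = np.full(len(p), 1.0 / len(p))       # uniform weights that sum to one
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = (p - A @ x) / row_norm_sq
        # Weighted sum of the individual Kaczmarz corrections.
        x = x + step * (A.T @ (u * residual))
    return x

A = np.array([[1.0, 1.0],
              [1.0, -2.0]])
x_true = np.array([2.0, 1.0])
p = A @ x_true
x_sim = simultaneous_update(A, p)
```

Because all corrections of one iteration are computed from the same estimate, they can be evaluated in parallel, which is the main practical appeal of this variant.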

Other than the presented method by Kaczmarz, there are a multitude 
of optimization approaches to solve this problem that are not covered here, 
e. g., Gradient Descent, Maximum-Likelihood Expectation-Maximization, or 
regularized reconstruction methods. There is also an immediate relation to 
analytical methods as described in Geek Box 8.5. 


8.3.3 Acquisition Geometries 


Fig. 8.16 illustrates several important acquisition geometries in CT imaging. 
Different types of CT scanners have been categorized into generations. CT 
scanners of the first generation practically realized the parallel beam geome- 
try as introduced above and shown in Fig. 8.16(a) (left). By introducing an 
array of detectors, the second generation could measure beams from several 
directions simultaneously. Only by the third generation, however, was this fan 
of directions (cf. Fig. 8.16(a), right) wide enough to remove the need for a 
translational motion during acquisition. The projections acquired using a fan 
beam geometry can be transformed ("rebinned") such that the reconstruction 
methods described earlier can still be applied. Alternatively, corresponding 
fan beam versions of the algorithms can be derived in a similar fashion to the 
ones presented. 

Another essential development is the addition of multiple detector rows 
(Fig. 8.16(b), left), leading to a dramatically increased imaging speed as many 
slices can be acquired in parallel (multi-slice CT). Another, newer kind of CT 
systems expands on this notion: by acquiring full 2-D projection images with 
an image intensifier or, more recently, a flat panel detector, cone beam 

(a) For a parallel beam geometry as introduced in the previous sections (shown 
on the left), the X-ray source needs to be shifted perpendicularly (dotted line) to 
the direction of projection, casting pencil beams through the object. If all beams 
instead emanate from a single position for each angle, we obtain a fan of no 
longer parallel rays (fan beam geometry; on the right), increasing acquisition 
efficiency at the cost of a slightly more complicated reconstruction problem. 
Apart from the flat shape shown here, there also exist curved detectors with an 
equiangular spacing. 


(b) Multiple detector arrays allow for simultaneous acquisition of multiple im- 
age slices from one X-ray source position (multi-slice CT; shown on the left). 
However, in this setup, the beams no longer all lie within the rotation plane. 
This issue becomes much more important in the case of cone beam CT (shown 
on the right): Here, the small stack of detector rows gives way to a larger de- 
tector matrix, with the beams now forming a cone in 3-D. 


Figure 8.16: Basic acquisition geometries in CT imaging. Blue arrows indicate 
the trajectory of the X-ray source. The detector is depicted by thick 
black lines. 

Geek Box 8.5: ART and its Relation to Filtered Back-Projection 

ART formulates the reconstruction problem as a system of linear equations. 

$$\mathbf{A} \mathbf{x} = \mathbf{p}$$

For a typical 3-D reconstruction problem with 512 projections with 
$512^2$ pixels each, $\mathbf{p} \in \mathbb{R}^{512^3}$ while the corresponding volume $\mathbf{x} \in \mathbb{R}^{512^3}$. 
Consequently, the operator is huge with $\mathbf{A} \in \mathbb{R}^{512^3 \times 512^3}$. In order to 
store such a matrix in floating point precision about 65,000 TB of 
memory would be required. However, $\mathbf{A}$ is very sparse as most entries 
are equal to 0. In computer implementations, it is typically computed 
on the fly using ray-casting. Thus, general inversion of $\mathbf{A}$ is infeasible, 
even when using the pseudo inverse with 

$$\mathbf{x} = \mathbf{A}^{\dagger} \mathbf{p} = \mathbf{A}^\top (\mathbf{A} \mathbf{A}^\top)^{-1} \mathbf{p}.$$

However, there are certain geometries for which $(\mathbf{A} \mathbf{A}^\top)^{-1}$ can be determined 
analytically. For the case of parallel-beam geometries, we 
know that $(\mathbf{A} \mathbf{A}^\top)^{-1}$ takes the form of a convolution with the ramp 
filter. $\mathbf{A}^\top$ is the adjoint operation of the projection operator. In continuous 
form, we introduced this operation already as back-projection 
(cf. Geek Box 8.2). Thus, filtered back-projection is a direct solution 
for the above system of linear equations. 
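
The memory estimate quoted in Geek Box 8.5 can be checked with a few lines of Python. We assume 4 bytes per entry (single precision) and binary units, i.e., 1 TB is taken as $2^{40}$ bytes; both are our assumptions, as the text only states "floating point precision".

```python
n_voxels = 512 ** 3                  # unknowns in the volume x
n_measurements = 512 * 512 ** 2      # 512 projections with 512^2 pixels each
bytes_per_entry = 4                  # assumption: single-precision floating point

total_bytes = n_measurements * n_voxels * bytes_per_entry
total_tib = total_bytes / 2 ** 40    # binary terabytes
print(f"dense system matrix: {total_tib:,.0f} TB")
```

Under these assumptions, the dense matrix occupies $2^{56}$ bytes, which matches the order of magnitude of about 65,000 TB stated above.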


Figure 8.17: A C-arm system for interventional X-ray imaging. Image courtesy 
of Siemens Healthineers AG. 

Figure 8.18: In spiral CT, although the X-ray source still conveniently rotates 
in the x-y plane, the trajectory it describes in relation to the imaged 
object is a helix due to the patient table being slowly moved through the 
gantry. This enables the acquisition of a large object region while rotating 
continuously. Projections for an ideal circular trajectory can be interpolated 
along z from neighboring helical segments. 


CT is able to capture a large field of view containing all of the object in 
a single rotation (Fig. 8.16(b), right). One of the main fields of use for this 
technology lies in interventional imaging where the X-ray source and detector 
are mounted on a C-arm device (Fig. 8.17). It has to be noted, though, that 
for arbitrary objects, an exact reconstruction is only possible in the plane of 
rotation. The more the beam diverges from this plane, the more artifacts are 
likely to appear: due to the incomplete data obtained from oblique rays, the 
reconstruction problem is underdetermined. 

For imaging larger parts of the body with few detector rows, it used to 
be necessary to perform a rotation, then halt and move the table such that 
the next slice to be acquired is lined up with the detector before starting 
anew. With the invention of helical CT, a continuous motion of both the 
rotating gantry and the table became possible. From the point of view of 
the imaged object, the X-ray source rotates in the x-y plane and moves 
in the axial direction at the same time, thus following a helix (Fig. 8.18). 
From the helical rotation, projections for all angles in an axial plane can be 
interpolated, enabling the use of standard reconstruction methods. 


8.4 Practical Considerations 


So far, we have described the theoretical background and principles for CT 
image reconstruction. However, in practice there are several aspects that have 
to be considered. 

Figure 8.19: The effective sizes $b_F$ of the focus and $b_D$ of the detector in the 
isocenter can be calculated from the distances to the isocenter $r_F$ and $r_D$. 


8.4.1 Spatial Resolution 


In many medical applications, we are not only interested in visualizing large 
organs, but also smaller structures, such as small blood vessels or calcifica- 
tions. Visualization of these small structures requires a high spatial resolution. 
In the following subsection, we will discuss what affects resolution in the x- 
y scan plane. Resolution in the z-direction typically needs to be considered 
separately as it depends on different factors. 

In the scan plane, resolution depends on several geometrical properties. 
Focus size, scan geometry, detector element spacing and aperture, and move- 
ment of the focus during image acquisition all influence the resolution. 

The focus size $s_F$ as well as the detector aperture $s_D$ contribute to image 
blurring, which can be modeled by 

$$b_F = \frac{r_D}{r_F + r_D} \cdot s_F \quad \text{and} \qquad (8.32)$$

$$b_D = \frac{r_F}{r_F + r_D} \cdot s_D, \qquad (8.33)$$

where $r_F$ represents the distance of the isocenter, i.e., the center of rotation, 
to the X-ray focus and $r_D$ the distance of the isocenter to the detector center. 
Effectively, $b_F$ and $b_D$ are the sizes of the focus and detector in the isocenter 
(cf. Fig. 8.19). Furthermore, the continuous movement of focus and detector 
during the image acquisition results in additional image blur, which we denote 
as $b_M$. The blur that occurs during image acquisition is then described by 

$$b_\text{acq} = \sqrt{b_F^2 + b_D^2 + b_M^2}. \qquad (8.34)$$


Figure 8.20: A bar phantom can be used to evaluate the spatial resolution 
of an imaging system. At sufficiently high spatial frequencies, individual 
lines can no longer be separated after imaging, i.e., we have determined the 
system's resolution limit. 

However, sampling and image reconstruction also introduce additional blur, 

$$b_A = c_A \cdot \Delta s, \qquad (8.35)$$

with $\Delta s$ the sampling distance and $c_A$ a constant factor which represents the 
reconstruction algorithm characteristics³. Finally, the total blur is modeled 
by 

$$b_\text{total} = \sqrt{b_F^2 + b_D^2 + b_M^2 + b_A^2}, \qquad (8.36)$$

whereas $b_\text{acq}$ represents the maximum spatial resolution given by the geometric 
setup, which could be achieved if we were to use a very fine sampling and 
a reconstruction algorithm with a sharp kernel. It becomes obvious that as 
a user, we only have limited influence on spatial resolution. We can decide 
which convolution kernel we want to use, but the geometrical parameters are 
defined by the system's scan modes. 
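
A short numerical sketch may help to get a feeling for the blur model of Eqs. (8.32) to (8.36). It assumes that the focus and detector sizes are scaled to the isocenter by similar triangles, i.e., $b_F = \frac{r_D}{r_F + r_D} s_F$ and $b_D = \frac{r_F}{r_F + r_D} s_D$. All numerical values and the function name `blur_model` are hypothetical and do not describe a particular scanner.

```python
import math

def blur_model(s_f, s_d, r_f, r_d, b_m=0.0, c_a=1.0, delta_s=0.0):
    """Return (b_F, b_D, b_acq, b_total) for the quadrature blur model."""
    b_f = r_d / (r_f + r_d) * s_f       # focus size scaled to the isocenter
    b_d = r_f / (r_f + r_d) * s_d       # detector aperture scaled to the isocenter
    b_acq = math.sqrt(b_f**2 + b_d**2 + b_m**2)
    b_a = c_a * delta_s                 # blur due to sampling and reconstruction
    b_total = math.sqrt(b_acq**2 + b_a**2)
    return b_f, b_d, b_acq, b_total

# Hypothetical geometry, all lengths in mm.
b_f, b_d, b_acq, b_total = blur_model(
    s_f=1.0, s_d=1.2, r_f=600.0, r_d=400.0, b_m=0.3, c_a=1.0, delta_s=0.5)
```

Since all contributions add in quadrature, the largest single term dominates the total blur. With a very fine sampling (small `delta_s`), `b_total` approaches `b_acq`.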

Spatial resolution can be measured directly and indirectly. For direct measurement, 
a bar phantom can be used. Such phantoms consist of alternating 
black and white bars of varying thickness. The resolution is determined by 
evaluating whether bars of a certain thickness are still distinguishable after 
acquisition and reconstruction (cf. Fig. 8.20). A more reliable and objective 
evaluation is the indirect approach. For this purpose, we scan a thin wire 
phantom⁴, thereby obtaining the system's so-called point spread function 
(PSF). The Fourier transform of the PSF yields the modulation transfer 
function (MTF) (cf. Fig. 8.21). Frequency is typically measured in line pairs 
per cm (lp/cm), a unit that can be intuitively understood if we recall the bar 
phantoms of the direct approach mentioned before. The spatial resolution of 
a system is often given by the 10% value of the MTF, which represents the 


3 E.g., in filtered back-projection, a smooth convolution kernel reduces noise but also 
spatial resolution, whereas a sharp kernel leads to more noise but yields a high spatial 
resolution (cf. Sec. 8.3.1). 


4 Essentially, this mimics a point object for each 2-D slice. 

Figure 8.21: If we could scan an ideal point object, the resulting reconstructed 
image would be the PSF of the system. The Fourier transform of 
the PSF is the MTF, which shows the relative contrasts achievable for vary- 
ing object frequencies. In practice, this measurement is typically performed 
using thin wire phantoms. 


Figure 8.22: The left image shows the reconstruction of a water cylinder 
phantom. The noise is stronger in the center than in the peripheral regions. 
On the right side, an elliptic phantom with two high intensity insets is de- 
picted. In its center, streak noise emerges that is caused by the strongly 
attenuating structures. For both phantom simulations, 90,000 photons per 
pixel with an energy of 75 keV were used. 


frequency at which the contrast has dropped to 10% of the maximum value 
at 0 lp/cm. 
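
The indirect measurement can be mimicked numerically. The following Python sketch computes the MTF as the normalized magnitude of the discrete Fourier transform of a 1-D PSF and reads off the frequency at which it drops to 10%. The Gaussian PSF, its width, the sampling distance, and the helper names are hypothetical choices for illustration; one cycle per cm corresponds to one line pair per cm.

```python
import numpy as np

def mtf_from_psf(psf, dx):
    """Return spatial frequencies (cycles per unit length) and the normalized MTF."""
    mtf = np.abs(np.fft.rfft(psf))
    freqs = np.fft.rfftfreq(len(psf), d=dx)
    return freqs, mtf / mtf[0]

def frequency_at_level(freqs, mtf, level=0.1):
    """First frequency at which the MTF drops below `level` (linear interpolation)."""
    below = np.nonzero(mtf < level)[0]
    if below.size == 0:
        raise ValueError("MTF never drops below the requested level")
    k = below[0]
    # Interpolate between the last sample above and the first sample below the level.
    t = (mtf[k - 1] - level) / (mtf[k - 1] - mtf[k])
    return freqs[k - 1] + t * (freqs[k] - freqs[k - 1])

dx = 0.005                               # sampling distance in cm (hypothetical)
x = (np.arange(4096) - 2048) * dx
sigma = 0.05                             # width of the toy Gaussian PSF in cm
psf = np.exp(-x**2 / (2.0 * sigma**2))
freqs, mtf = mtf_from_psf(psf, dx)
f10 = frequency_at_level(freqs, mtf)     # 10% MTF frequency in lp/cm
```

For a Gaussian PSF, the result can be compared with the closed-form value $\sqrt{\ln(10)/2} / (\pi \sigma)$, which serves as a sanity check of the numerical procedure.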


8.4.2 Noise 


From the considerations regarding noise in X-ray projections (cf. Sec. 7.4.3 
(p. 136)), we know that the number of photons $n_s$ measured by our detector 
can be modeled by a Poisson distribution with an expected value of $E[n_s] = N_0 p_a$, 
where $N_0$ is the expected value of the number of photons generated 
by the X-ray tube, and $p_a$ is the probability of a photon passing through 
the imaged object unaffected. The Poisson process can be approximated by 
a Gaussian distribution with mean value $\mu = E[n_s]$ and standard deviation 
$\sigma = \sqrt{E[n_s]}$, if we are dealing with a high number of events, i.e., photons. 
For reconstruction, we convert the measured projection images to line 
integral images by taking the negative logarithm (cf. Sec. 7.3.1 (p. 126)), 

$$\int \mu(s) \, \mathrm{d}s = -\ln \frac{I}{I_0}, \qquad (8.37)$$

where $\frac{I}{I_0} = \frac{n_s}{N_0} = p_a$. 
Using the first order Taylor expansion, it can be shown that this transform 
leads to a new approximate Gaussian distribution with $\mu = -\ln p_a$ and 
$\sigma = \frac{1}{\sqrt{N_0 p_a}}$. Note that the noise variance increases with object thickness. 

During reconstruction, the back-projection step computes a weighted sum 
of the (filtered) projection values. Hence, the object dependence of the noise 
statistics is propagated into 3-D. This can be seen in Fig. 8.22, where most 
of the noise is found in the center of the objects. Additionally, in a 
non-circular object, streak structures appear in the noise. Therefore, denoising in 
the reconstructed domain needs to take the directional nature of the noise 
generation into account. 
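
The Taylor approximation of the noise in the line integral domain, i.e., a standard deviation of $1/\sqrt{N_0 p_a}$, can be checked with a small Monte Carlo experiment. The sketch below draws Poisson-distributed photon counts and compares the empirical standard deviation of the negative logarithm with the prediction. The values of $N_0$ and $p_a$, the number of samples, and the random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n0 = 90_000          # expected number of emitted photons per pixel (hypothetical)
p_a = 0.1            # probability of passing the object unaffected (hypothetical)

# Detected photon counts follow a Poisson distribution with mean N0 * p_a.
counts = rng.poisson(lam=n0 * p_a, size=200_000)

# Conversion to line integrals by the negative logarithm.
line_integrals = -np.log(counts / n0)

predicted_sigma = 1.0 / np.sqrt(n0 * p_a)
measured_sigma = line_integrals.std()
measured_mean = line_integrals.mean()
```

Lowering `p_a`, which corresponds to a thicker or denser object, increases both the predicted and the measured standard deviation, in line with the statement that the noise variance increases with object thickness.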


8.4.3 Image Artifacts 


An ideal image reconstruction is only possible in theory. In reality we have 
to deal with different physical phenomena which are detrimental to image 
quality and can result in image artifacts. In the following paragraphs, the 
most common types of image artifacts and ways to reduce their influence will 
be discussed. 


8.4.3.1 Beam Hardening 


In practice, CT uses polychromatic X-ray sources, which leads to the attenu- 
ation of a homogeneous object being not proportional to the thickness of the 
object along the ray. A polychromatic X-ray source produces a wide, continu- 
ous spectrum of energies and X-ray attenuation coefficients are dependent on 
the energy. A detailed mathematical description of the spectrum is provided 
in Sec. 8.5. 

When an X-ray beam passes through an object, lower energy photons are more 
easily absorbed than higher energy photons. This effect is called “beam hardening”. 
Beam hardening results in streak and cupping artifacts. A common 
approach to deal with beam hardening is to physically pre-filter the X-rays 
using thin metallic plates, which absorb the low energy photons. An in-depth 
discussion of beam hardening, including examples, is provided in Sec. 8.5.3. 

Figure 8.23: Anti-scatter grids can be placed in front of the detector to 
reject scattered radiation. 


8.4.3.2 Scatter Artifacts 


Scatter, or more specifically Compton scatter, causes X-ray photons to change 
direction and energy. A scattered photon can therefore be measured in a dif- 
ferent detector element than intended. This has an especially large effect 
when the scattered photon is measured in a detector element that normally 
would have only few photons, e.g., if a high density object like a metal im- 
plant blocks all incoming photons, the corresponding detector element only 
detects scattered photons. Scatter artifacts are noticeable as cupping and 
streak artifacts especially between high density structures. Most scanners 
use anti-scatter grids in front of the detector to reduce scatter. Such a grid 
consists of thin lead strips, separated by an X-ray transparent spacer material. It 
is placed on the detector and aligned towards the focal spot. Thus, a photon 
that was not scattered can pass through the grid, while most scattered 
photons will be absorbed by the lead, cf. Fig. 8.23. 


8.4.3.3 Partial Volume Effect 


Partial volume artifacts appear mostly in low resolution images, especially in 
thick slice images. With low resolution, it is possible that one pixel consists 
of two regions with different absorption coefficients $\mu_1$ and $\mu_2$, cf. Fig. 8.24, 
which leads to streak artifacts in the reconstruction. Geek Box 8.6 describes 
the problem in more detail. This type of artifact is not often seen with 
state-of-the-art CT systems as the image resolution and especially slice resolution 
have improved drastically. 

Figure 8.24: Two regions within a pixel with different absorption coefficients 
result in different measured intensities $I$ for different projection angles. 

Geek Box 8.6: Partial Volume Effect 

In the first case, we observe two separate regions with the corresponding 
absorption equations 

$$I_1' = I_1 \, e^{-\mu_1 \Delta x}, \qquad (8.38)$$

$$I_2' = I_2 \, e^{-\mu_2 \Delta x}, \qquad (8.39)$$

where $I_1$ and $I_2$ are the portions of the incident intensity that enter the two 
regions, i.e., $I_1 + I_2 = I_0$. Thus, the total measured intensity in this pixel is 

$$I = I_1 \, e^{-\mu_1 \Delta x} + I_2 \, e^{-\mu_2 \Delta x}, \qquad (8.40)$$

which is not equivalent to the average absorption we would expect, 

$$I = I_0 \, e^{-\frac{1}{2}(\mu_1 + \mu_2) \Delta x} \qquad (8.41)$$

$$\phantom{I} = (I_1 + I_2) \, e^{-\frac{1}{2}(\mu_1 + \mu_2) \Delta x}. \qquad (8.42)$$

However, in the case of the orthogonal direction, we do arrive at the 
average absorption, 

$$I = I_0 \, e^{-\mu_1 \frac{\Delta x}{2}} \, e^{-\mu_2 \frac{\Delta x}{2}} \qquad (8.43)$$

$$\phantom{I} = I_0 \, e^{-\mu_1 \frac{\Delta x}{2} - \mu_2 \frac{\Delta x}{2}} \qquad (8.44)$$

$$\phantom{I} = I_0 \, e^{-\frac{1}{2}(\mu_1 + \mu_2) \Delta x}, \qquad (8.45)$$

which is not equivalent to Eq. (8.40). 
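
A quick numerical check illustrates the inconsistency described in Geek Box 8.6. The Python sketch below compares the line integral obtained when the two regions lie side by side with the one obtained when the ray passes through both regions one after the other. The attenuation coefficients, the pixel width, and the equal split of the incident intensity are hypothetical assumptions.

```python
import math

mu_1, mu_2 = 0.2, 0.5    # hypothetical attenuation coefficients in 1/cm
delta_x = 1.0            # hypothetical pixel width in cm
i_0 = 1.0                # incident intensity

# Side by side: each region receives half of the incident intensity (assumption)
# and attenuates it over the full pixel width.
i_side = 0.5 * i_0 * math.exp(-mu_1 * delta_x) + 0.5 * i_0 * math.exp(-mu_2 * delta_x)

# One behind the other: the ray passes delta_x / 2 of each material.
i_series = i_0 * math.exp(-0.5 * (mu_1 + mu_2) * delta_x)

# Line integrals after the negative logarithm.
q_side = -math.log(i_side / i_0)
q_series = -math.log(i_series / i_0)
```

Averaging intensities before taking the logarithm yields a smaller line integral than the average attenuation would suggest, so the two viewing directions report different values for the same pixel. This mismatch between projections is what shows up as streaks in the reconstruction.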
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Figure 8.25: Reconstructions of an electron density phantom acquired at 
tube voltages of 50kV (left), 80kV (middle) and 125 kV (right). Photon star- 
vation caused by the titanium rod at around 4 o'clock leads to pronounced 
streak artifacts. This effect decreases at higher tube voltages as more photons 
are produced and comparatively fewer of them are absorbed. Images courtesy 
of Jang-Hwan Choi. 


8.4.3.4 Metal Artifacts 


Metal artifacts are among the most common image artifacts in CT imag- 
ing. This term covers many different types of artifacts that we already dis- 
cussed. There are various reasons why metal artifacts can occur. Metal causes 
beam hardening and scatter, which results in dark streaks between the metal 
objects. Additionally, its very large attenuation coefficient leads to photon 
starvation behind the metal object; as most photons are absorbed, only an 
insufficient number of them can be measured, leading to noisy projections. 
The noise is amplified in the reconstruction and will lead to streak artifacts 
in these regions, cf. Fig. 8.25. 

Metal artifacts can be reduced by increasing the X-ray tube current or with 
automatic tube current modulation. Alternatively, there are metal artifact 
reduction algorithms that try to solve this problem without additional dose. 
Some algorithms aim to remove the metal objects in the reconstructed image 
and iteratively interpolate the holes in the forward-projected images. 


8.4.3.5 Motion Artifacts 


If motion, e. g., cardiac, respiratory, or patient motion, is present during an 
image acquisition, we end up with an inconsistent set of projection images. 
This can lead to blurring or streak artifacts in the reconstructed images. 
This type of artifact is especially prevalent with C-arm cone beam CT systems. 
Due to the slow rotation speed of the C-arm, a typical abdominal or 
heart scan takes approximately 4 to 5 s, during which significant respiratory 
or cardiac motion can occur. These artifacts can be reduced by estimating 
the motion field and correcting it during image reconstruction. An example 
is shown in Fig. 8.26. 

Figure 8.26: An image of human legs reconstructed from motion-corrupted 
projections using regular filtered back-projection (left) and with an additional 
marker-based correction (right). By compensating for motion during 
reconstruction, structures that were originally blurred due to the movement 
become visible and streak artifacts caused by misalignments are reduced 
considerably. 

Figure 8.27: Illustration of truncation artifacts in the reconstructed slices. 


8.4.3.6 Truncation Artifacts 


Truncation occurs when a scanned object is larger than the detector area or 
X-ray beams are intentionally collimated to a diagnostic region of interest 
for saving dose. Both cases will result in laterally truncated data. Due to the 
non-local property of the ramp filter, filtered back-projection reconstruction 
requires information from the whole projection for each point in the object. This 
requirement, however, is no longer satisfied if projection data are laterally 
truncated. Thus, a noticeable degradation of image quality manifesting as a 
cupping-like low-frequency artifact as well as incorrect absorption values will 
be observed in the reconstruction, as illustrated in Fig. 8.27. 

A popular truncation correction is based on estimating the missing data 
using a heuristic extrapolation procedure. For instance, a symmetric mirror- 
ing extrapolation scheme could be used to reduce the truncation artifacts 
from objects extending outside the measured field of view. Also, the missing 
measurements can be approximated by integrals along rays through a 2-D 
water cylinder since it is able to approximately describe a human body. 


8.5 X-ray Attenuation with Polychromatic Attenuation 


Traditional CT measures the spatial distribution of the X-ray attenuation of 
an object. The X-ray attenuation of a material is energy dependent. At a specific 
energy, it is governed by the composition of the material, more precisely 
by its mass density and the atomic numbers and proportions of its elements. 
As described in Sec. 8.3, the common measure for X-ray attenuation in med- 
ical CT is the HU. However, for a material other than water and air, the HU 
value depends on the system design and settings of the CT device as well 
as the characteristics of the complete scanned object. Fig. 8.28 exemplar- 
ily shows the energy dependent X-ray attenuation for bone and soft tissue. 
This dependency is caused by the non-linear attenuation characteristics of 
polychromatic radiation. 


8.5.1 Mono- vs. Polychromatic Attenuation 


When a monochromatic X-ray beam at energy $E_0$ passes through an object, 
the measured intensity $I_\text{mono}$ follows Lambert-Beer’s law: 

$$I_\text{mono}(E_0) = I_0(E_0) \, e^{-\int \mu(s, E_0) \, \mathrm{d}s}, \qquad (8.46)$$

where $I_0(E_0)$ refers to the intensity of the incident X-ray at energy $E_0$, $s$ 
denotes the path of the X-ray traversing the object, and $\mu(s, E_0)$ denotes the 
spatial distribution of energy-dependent linear attenuation coefficients. 

The attenuation can be obtained by rewriting Equation (8.46) as: 

$$q_\text{mono}(E_0) = -\ln \frac{I_\text{mono}(E_0)}{I_0(E_0)} = \int \mu(s, E_0) \, \mathrm{d}s. \qquad (8.47)$$

Thus, for the monochromatic case, there is a linear relationship between the 
monochromatic attenuation $q_\text{mono}(E_0)$ and the intersection length of X-ray 
and the object. 

Figure 8.28: Illustration of energy dependent X-ray attenuation for bone 
and soft tissue. 

However, the X-ray sources in typical clinical scanners are polychromatic 
sources. In addition, monochromatic measurements cannot provide real quan- 
titative information as HUs are energy-dependent, i.e., different spectra and 
filters will result in different HUs. Although there exist physical ways to cre- 
ate monochromatic X-rays at sufficient intensity to perform X-ray CT, e. g., 
using a monochromator or inserting thick absorption filters to narrow the 
spectrum, these methods are very expensive and the usage is restricted to 
research experiments at few institutions. As detailed in Geek Box 8.7, in contrast 
to the monochromatic X-ray situation, there is no linear relationship 
between the polychromatic attenuation $q_\text{poly}$ and the intersection length of 
X-ray and the object. 

Fig. 8.30 depicts the relationship between the intersection length $p$ and 
the attenuation $q$ for two cases: a polychromatic X-ray beam, emitted at a 
tube voltage of 110 kV, penetrating a homogeneous aluminium object, and a 
monochromatic X-ray beam at the effective energy of 45.56 keV of this 
polychromatic beam traversing the same object. 

Geek Box 8.7: Polychromatic Line Integrals 

As mentioned above, in practical setups only a polychromatic X-ray beam is 
available rather than a monochromatic one. Summing the monochromatic 
contributions for each energy bin $E$ in the X-ray spectrum ($E \in [0, E_\text{max}]$) gives: 

$$I_\text{poly} = \int_0^{E_\text{max}} S(E) \, D(E) \, e^{-\int \mu(s, E) \, \mathrm{d}s} \, \mathrm{d}E, \qquad (8.48)$$

where $I_\text{poly}$ denotes the measured detector signal of a polychromatic 
X-ray, $S(E)$ denotes the spectral energy distribution and $D(E)$ 
denotes the detector energy sensitivity. Fig. 8.29 shows an example of 
a spectrum and the integral under it to illustrate Equation (8.48). 

Consequently, adapting Equation (8.47) to the polychromatic situation 
yields 

$$q_\text{poly} = -\ln \frac{I_\text{poly}}{I_0} = -\ln \int_0^{E_\text{max}} N(E) \, e^{-\int \mu(s, E) \, \mathrm{d}s} \, \mathrm{d}E, \qquad (8.49)$$

where 

$$N(E) = \frac{S(E) \, D(E)}{I_0} \qquad (8.50)$$

refers to the normalized energy spectrum with the effective detected 
intensity (system weighting function) $I_0$ defined by 

$$I_0 = \int_0^{E_\text{max}} S(E') \, D(E') \, \mathrm{d}E'. \qquad (8.51)$$
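
The non-linear behavior of the polychromatic attenuation can be reproduced with a toy model. The following Python sketch evaluates a discretized version of the polychromatic attenuation for a homogeneous object of varying thickness and compares it with the monochromatic case. The triangular spectrum and the attenuation curve `mu` are purely hypothetical stand-ins and do not represent real tube spectra or material data.

```python
import numpy as np

energies = np.linspace(20.0, 110.0, 91)    # energy bins in keV (toy grid)

# Toy product S(E) * D(E): triangular shape, normalized to obtain N(E).
weights = np.minimum(energies - 15.0, 115.0 - energies)
spectrum = weights / weights.sum()

def mu(energy_kev):
    """Toy attenuation coefficient in 1/cm that decreases with energy (not real data)."""
    return 0.4 + 2.0 * (30.0 / energy_kev) ** 3

def q_poly(length_cm):
    """Polychromatic attenuation of a homogeneous object of the given thickness."""
    return -np.log(np.sum(spectrum * np.exp(-mu(energies) * length_cm)))

def q_mono(length_cm, energy_kev=60.0):
    """Monochromatic attenuation: strictly proportional to the thickness."""
    return mu(energy_kev) * length_cm

lengths = np.array([0.5, 1.0, 2.0, 4.0])
poly_values = np.array([q_poly(length) for length in lengths])
mono_values = q_mono(lengths)
```

In this toy model, doubling the thickness doubles `q_mono` but increases `q_poly` by less than a factor of two, because the low energy part of the spectrum is removed first and the remaining beam is attenuated less strongly.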
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Figure 8.29: A spectrum and the integral under it. 
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Figure 8.30: Relationship between intersection length and attenuation. 


8.5.2 Single, Dual, and Spectral CT 


As mentioned in the previous section, standard single energy CT reconstruction 
assumes mono-energetic radiation; however, common X-ray sources for 
medical CT are polychromatic. In single energy CT, the 
energy information of the spectral attenuation coefficient is lost due to the 
measurement process. Therefore, the polychromatic characteristics of the input 
spectrum are neglected in single energy CT. Instead of an input spectrum 
$S(E)$ and a detection sensitivity $D(E)$, an effective detected intensity $I_0$ is 
measured in a calibration step to recover the corresponding effective attenuation. 
As a consequence, single energy CT is unable to provide quantitative 
information on tissue composition. 

On the other hand, if spectral input data is acquired, i. e., multiple measurements 
with different spectral characteristics are made for each projection 
ray, real quantitative information on the scanned anatomy becomes available. For 
instance, dual energy CT measures two image sets at different energy weightings, 
e. g., by performing two scans with tube voltages set to 80 kVp⁵ and 140 
kVp, respectively. Fig. 8.31 shows an example of such a dual energy CT scanner, 
the Siemens Definition Flash (Siemens Healthineers AG, Forchheim, Germany). 
It employs two tube-detector pairs and produces two measurements simultaneously 
at different tube voltages. For spectral CT applications with dual 
source data, it is desirable to use two spectra with as little overlap as possible 
in order to ensure the maximal spectral separation between the two acquired 
datasets. For this task, usually the two tubes are operated at two different 
kVp settings. Additionally, a special filter can be used to attenuate the lower 
energy components in the high energy tube spectrum. 

Although most spectral CT scanners require two spectral measurements, 
certain scenarios need more measurements. The output quantities 
of these algorithms differ with respect to the diagnostic demands: they 
range from energy-normalized attenuation values over physically motivated 
quantities like density and effective atomic number to spatial distributions of 
whole attenuation spectra. The most popular current diagnostic applications 
are bone removal, PET/SPECT attenuation correction, lung perfusion diagnosis, 
or quantification of contrast agent concentrations, for instance in the 
myocardium. 

5 The peak acceleration voltage of X-ray tubes is usually given in kVp (kilovolt peak). 
An acceleration voltage of 120 kVp results in an X-ray spectrum where the individual 
photon energies are distributed in the range from 0 to 120 keV. 

Figure 8.31: A dual energy CT scanner: Siemens Definition Flash. Image 
courtesy of Siemens Healthineers AG. 


8.5.3 Beam Hardening 


When a polychromatic X-ray beam penetrates an object, photons with lower 
energies are easier absorbed by the object than photons with higher energies. 
Consequently, the average energy of an X-ray spectrum shifts toward higher 
energies while traversing the object, namely, the spectrum of the X-ray beam 
“hardens”. The spectrum becomes harder with increasing intersection length 
of the X-ray with the object. This effect is called beam hardening. 

Fig. 8.32 illustrates this effect with an example. We assume that an X-ray
beam is emitted at 120 kVp acceleration voltage and that the anode material
is tungsten. We then add layers of aluminum filtration to the beam. The
corresponding effective energy $E_\text{eff}$ shifts from 56.57 keV to
74.18 keV with increasing thickness of the aluminum wedge filter.
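
The hardening of a filtered spectrum can be reproduced qualitatively with a few lines of Python. The following is only a minimal sketch under loud assumptions: `toy_spectrum` is a hypothetical parabola and not a tungsten spectrum, `toy_mu` is an invented attenuation model (a steep $E^{-3}$ term plus a constant) and not tabulated aluminum data, and the intensity-weighted mean energy is used as a simple stand-in for the effective energy. The numbers therefore do not match the values quoted above; only the trend does.

```python
import math

def toy_spectrum(energies_kev, kvp=120.0):
    """Hypothetical spectrum shape that vanishes at 0 keV and at the kVp."""
    return [e * (kvp - e) if 0.0 < e < kvp else 0.0 for e in energies_kev]

def toy_mu(e_kev, a=5.0e4, b=0.4):
    """Assumed filter attenuation in 1/cm: steep E^-3 term plus a flat term."""
    return a / e_kev**3 + b

def filter_spectrum(energies_kev, spectrum, thickness_cm):
    """Apply the Beer-Lambert law separately to every energy bin."""
    return [s * math.exp(-toy_mu(e) * thickness_cm)
            for e, s in zip(energies_kev, spectrum)]

def mean_energy(energies_kev, spectrum):
    """Intensity-weighted mean photon energy in keV."""
    return sum(e * s for e, s in zip(energies_kev, spectrum)) / sum(spectrum)

energies = [float(e) for e in range(1, 121)]      # 1 keV bins up to 120 keV
unfiltered = toy_spectrum(energies)
thicknesses_cm = [0.0, 0.5, 1.0, 2.0]
mean_energies = [mean_energy(energies, filter_spectrum(energies, unfiltered, t))
                 for t in thicknesses_cm]

for t, e_mean in zip(thicknesses_cm, mean_energies):
    print(f"{t:4.1f} cm filter: mean energy {e_mean:6.2f} keV")
```

Because low-energy bins are attenuated more strongly than high-energy bins, the mean energy grows with every additional layer of filter material, which is the same mechanism that hardens the beam inside the patient.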

If beam hardening is not taken into consideration during reconstruction,
the reconstructed image will be contaminated by beam hardening artifacts,
which typically manifest as cupping and streak artifacts. As mentioned
above, the spectrum becomes harder with increasing intersection length.
Hence, in reconstructed images, the inner part of the object appears darker
than the outer part, and a corresponding cupping appears in the
reconstruction. Streak artifacts appear as dark bands or streaks in the
reconstructed image. They occur because X-rays that pass through only one
dense object are less hardened than those passing through both dense
objects. Fig. 8.33 shows a simulated example of an elliptical water phantom
with two dense bone insets.

Figure 8.32: Tube spectra with different amounts of filtration.

Figure 8.33: Beam hardening example: (a) A simple phantom set-up consisting
of water (gray), bone (white) and air (black). (b) Reconstruction of the
phantom with visible beam hardening artifacts.

Various beam hardening correction algorithms exist. For a soft-tissue
calibration, projection measurements through soft-tissue-like materials of
variable, known thicknesses are performed. For these, the equivalent water
thicknesses are known. A simple function is fit through the pairs of
measured and expected values. This function is inverted and then used as a
look-up table: for each measured attenuation, the equivalent water thickness
is looked up, which then replaces the measured attenuation. If bone-induced
beam hardening is also to be corrected, a separation into bone and water
components can be performed. For dual energy CT, special quantitative
correction methods exist. These take advantage of the two measurements at
different energy weightings and special properties of the attenuation
functions of body tissues.
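
The look-up idea of the soft-tissue calibration can be sketched as follows. This is an illustration under stated assumptions rather than a clinical implementation: the spectrum weights and the water model `mu_water` are invented toy functions, the calibration measurements are simulated instead of measured, and the "simple function" is replaced by piecewise linear interpolation in the calibration table. All names are hypothetical.

```python
import math
from bisect import bisect_left

ENERGIES = list(range(20, 121, 10))               # keV, coarse toy grid
WEIGHTS = [e * (125.0 - e) for e in ENERGIES]     # hypothetical spectrum shape
TOTAL = sum(WEIGHTS)

def mu_water(e_kev):
    """Assumed water-like attenuation in 1/cm (toy model, not tabulated data)."""
    return 5200.0 / e_kev**3 + 0.16

def measured_attenuation(thickness_cm):
    """Polychromatic attenuation p = -ln(I / I0) through water."""
    transmitted = sum(w * math.exp(-mu_water(e) * thickness_cm)
                      for e, w in zip(ENERGIES, WEIGHTS))
    return -math.log(transmitted / TOTAL)

# Calibration: known thicknesses and the attenuation obtained for each.
CAL_THICKNESS = [0.5 * k for k in range(81)]      # 0 cm to 40 cm
CAL_ATTENUATION = [measured_attenuation(d) for d in CAL_THICKNESS]

def equivalent_water_thickness(p):
    """Invert the calibration curve by linear interpolation in the table."""
    i = bisect_left(CAL_ATTENUATION, p)
    if i == 0:
        return CAL_THICKNESS[0]
    if i == len(CAL_ATTENUATION):
        return CAL_THICKNESS[-1]
    p0, p1 = CAL_ATTENUATION[i - 1], CAL_ATTENUATION[i]
    d0, d1 = CAL_THICKNESS[i - 1], CAL_THICKNESS[i]
    return d0 + (d1 - d0) * (p - p0) / (p1 - p0)

# The measured attenuation per centimeter drops with depth (beam hardening),
# while the looked-up thickness is linear in the true thickness again.
print(measured_attenuation(1.0) / 1.0, measured_attenuation(20.0) / 20.0)
print(equivalent_water_thickness(measured_attenuation(12.3)))
```

The table inversion works because the polychromatic attenuation is a monotonically increasing function of the water thickness, so every measured value maps to exactly one equivalent thickness.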


8.6 Spectral CT 


8.6.1 Different Spectral CT Measurements 


Spectral CT detection refers to producing multiple measurements of the same 
object with different spectral weightings. The spectral weighting is defined 
by the tube spectrum and the spectral sensitivity of the detector. In spectral 
detection techniques, one of these or both are changed between measure- 
ments. The spectral weightings should have as little overlap as possible. This 
enhances the discrimination between the spectral measurements which is ben- 
eficial for spectral CT algorithms. Usually, only two spectral measurements 
are created due to dose limitations and the fact that most spectral CT algo- 
rithms do not benefit from additional spectral measurements. This fact can 
be attributed to the specific attenuation properties of body materials in the 
CT energy range. 


8.6.1.1 Dual KVp 


The easiest method for producing spectral measurements is called Dual kVp. 
For this method, two subsequent CT scans are performed at different tube 
voltages, e. g., 80 kVp and 140 kVp; see Fig. 8.34(a) for spectra of these two 
voltages. As mentioned before, it is desirable to use two spectra with as 
little overlap as possible in order to ensure the maximal spectral separation 
between the two acquired datasets. To this end, usually a special filter can 
be used to attenuate the lower energy components in the high energy tube 
spectrum (140 kVp); see Fig. 8.34(b). 

The main advantage is that no special equipment is needed for this method. 
In medical CT, this method is prone to motion artifacts as the alignment of 
the two datasets cannot be ensured due to patient motion in between the two 
scans. However, this is a valid method for evaluating spectral CT algorithms 
on phantom data. 
Figure 8.34: (a) Spectral measurements with different tube voltages, e. g.,
80 kVp and 140 kVp. (b) A special filter was applied to the high energy tube
spectrum to ensure two spectra with as little overlap as possible.

Figure 8.35: Concept of dual source CT.


8.6.1.2 KVp-switching 


KVp-switching is another tube-based approach that switches the tube voltage
between two readings. As read-out times are typically in the range of
hundreds of microseconds, a special tube capable of changing the tube
voltage very quickly is required. For dose efficiency, the tube current
should also be adjusted for the different kVp settings, as the attenuation
properties of human body tissue differ considerably between X-ray energies.
The projections acquired with this approach are not perfectly aligned
because they are interleaved. Missing projections may have to be
interpolated.
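
As a minimal sketch of what such an interpolation could look like, the following Python function fills the missing views of one kVp setting with the mean of the two neighbouring views of the same setting. The function name, the single-channel layout, and the use of plain linear interpolation are assumptions made for illustration only; practical systems may use more elaborate schemes.

```python
def fill_interleaved(readings, start):
    """Complete the view sequence of one kVp setting.

    readings[i] is the value measured at view i for one detector channel.
    Views start, start + 2, ... were acquired at the kVp setting of interest;
    all other views are replaced by the mean of their acquired neighbours.
    """
    n = len(readings)
    full = [None] * n
    for i in range(start, n, 2):
        full[i] = readings[i]
    for i in range(n):
        if full[i] is None:
            neighbours = []
            if i - 1 >= 0:
                neighbours.append(readings[i - 1])
            if i + 1 < n:
                neighbours.append(readings[i + 1])
            full[i] = sum(neighbours) / len(neighbours)
    return full

# Even views were read out at the low voltage, odd views at the high voltage.
readings = [0, 10, 2, 12, 4, 14]
low_kvp = fill_interleaved(readings, start=0)
high_kvp = fill_interleaved(readings, start=1)
print(low_kvp)
print(high_kvp)
```

At the first and last view only one neighbour exists, so the sketch simply copies that neighbour instead of extrapolating.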
Figure 8.36: Concept of dual layer detector.


8.6.1.3 Dual Source 


Dual Source CT is similar to dual kVp with the two CT scans being performed 
simultaneously by a special CT system. In this system, the gantry houses 
two tube-detector pairs A and B with a fixed angular offset (see Fig. 8.35). 
The two X-ray tubes are operated at different tube voltages. More recent 
systems offer an optional tin filter on one tube to increase spectral separation 
whereas the two detectors are usually identical in terms of spectral sensitivity. 
Most available systems, however, use differently sized detectors due to space 
restrictions within the gantry. So the measurements of the smaller detector 
offer a limited field of view (FOV). The data from the larger detector can be 
used to compensate for truncation artifacts in the reconstruction but Dual 
Energy data is only available for the smaller FOV. Since the two tube-detector 
pairs are operated simultaneously, scatter radiation from tube A impairs 
the signal of detector B and vice versa. This is a major drawback of this 
technology, as this property decreases signal quality and leads to an increased 
patient dose. 


8.6.1.4 Dual Layer Detectors 


This technology uses a variation of the detector spectral sensitivity to
produce measurements at different energy weightings. Two scintillation
detector layers are stacked upon each other, and the top detector layer acts
as a pre-filter for the lower one. This technology is also referred to as a
sandwich detector. Fig. 8.36 shows a possible realization of this concept.
The detector efficiency is lowered, as the photodiodes and wiring of the top
layer absorb part of the X-rays. In addition, escape photons may enter the
other layer and impair the energy separation of the layers.
Figure 8.37: Spectral sensitivity example for a counting semiconductor
detector with thresholds set to 5 keV and 60 keV. Due to several effects like
cross talk, escape photons and signal pile-up, the spectral separation is
reduced by a considerable overlap of the sensitivity curves.


8.6.1.5 Counting Detectors 


Spectral measurements can be conducted with counting detectors by using
multiple energy-thresholded photon counts. Theoretically, X-ray counting for
medical CT can be performed with scintillators and semiconductor detectors.
As semiconductor detectors have the advantage of being very fast and having
very limited cross-talk between channels, a lot of effort has been put into
evaluating the suitability of these detectors for medical CT. However, some
issues still have to be resolved before this technology becomes commercially
available in medical CT scanners. Counting detectors perform especially well
at low X-ray flux. At high flux levels, which typically appear in medical
CT, several problems arise: signal saturation prevents the distinction of
individual detection events, and polarization of the semiconductor material
affects the signal quality. Due to physical effects, material defects, and
technical limitations, the discrimination of X-ray quanta cannot be perfect.
This leads to a limited spectral separation between the spectral
sensitivities for each threshold signal, which depends on the incoming X-ray
flux. Fig. 8.37 shows spectral sensitivities for thresholds producing photon
counts below and above 60 keV at low X-ray flux and their overlap for a
140 kVp tube spectrum.
8.6.2 Basis Material Decomposition


Fully Spectral CT approaches generally yield measures that are directly re- 
lated to physical properties of the imaged object or tissue. Unlike HUs, these 
measures should not be system-dependent or be influenced by the surround- 
ing object. The following section introduces basis material decomposition, 
which yields two or more effective basis material densities to characterize 
underlying material compositions. 

In general, the spectral attenuation coefficient of a material can be
expressed as a linear combination of $M$ energy-dependent basis functions
$f_j(E)$:

$$\mu(\mathbf{r}, E) = \sum_{j=1}^{M} c_j(\mathbf{r})\, f_j(E), \tag{8.52}$$

where $c_j(\mathbf{r})$ denotes the spatially-dependent coefficients, in
which $\mathbf{r} = (x, y, z)$ refers to the spatial location.

The principle of material decomposition is based on the fact that the
spectral attenuation coefficients of body materials are dominated by two
effects in the energy range of medical CT: photoelectric absorption and
Compton scattering, as described in Sec. 7.3 (p. 125). Since two basis
materials are sufficient to express $\mu(\mathbf{r}, E)$ for body materials
with very small errors, a separation of the energy-dependent basis functions
$f_j(E)$ from the spatially-dependent coefficients $c_j(\mathbf{r})$ is
possible. The typical choice for basis functions in medical CT is a set of
water and bone mass attenuation functions. We denote the basis functions
$f_W(E)$ and $f_B(E)$, with $f_W(E)$ corresponding to the mass attenuation
coefficient of water and $f_B(E)$ to that of femur bone. The corresponding
basis material coefficients are denoted $c_W(\mathbf{r})$ and
$c_B(\mathbf{r})$. For this basis material set, Equation (8.52) reads:

$$\mu(\mathbf{r}, E) = c_W(\mathbf{r}) \cdot f_W(E) + c_B(\mathbf{r}) \cdot f_B(E). \tag{8.53}$$

Two methods to recover $c_W(\mathbf{r})$ and $c_B(\mathbf{r})$ are presented
in Geek Boxes 8.8 and 8.9.
Geek Box 8.8: Projection-Based Basis Material Decomposition


Inserting Equation (8.53) into the line integral of the spectral
attenuation law yields:

$$\int \mu(\mathbf{r}, E)\, \mathrm{d}s = f_W(E) \int c_W(\mathbf{r})\, \mathrm{d}s + f_B(E) \int c_B(\mathbf{r})\, \mathrm{d}s. \tag{8.54}$$

We denote the line integral over the water coefficients
$A_W = \int c_W(\mathbf{r})\, \mathrm{d}s$. The integral $A_B$ over the bone
coefficients is defined along the same line.

Conducting a dual energy measurement at two energy weightings
$S_1(E) D_1(E)$ and $S_2(E) D_2(E)$ gives the following system of non-linear
equations:

$$I_1 = \int_0^\infty S_1(E)\, D_1(E)\, e^{-f_W(E) A_W - f_B(E) A_B}\, \mathrm{d}E, \tag{8.55}$$

$$I_2 = \int_0^\infty S_2(E)\, D_2(E)\, e^{-f_W(E) A_W - f_B(E) A_B}\, \mathrm{d}E. \tag{8.56}$$

This system has to be solved for $A_W$ and $A_B$, which is a topic of current
research. Then, the basis material coefficients $c_W(\mathbf{r})$ and
$c_B(\mathbf{r})$ can be recovered from $A_W$ and $A_B$ with a plain inverse
Radon transform as used for standard CT reconstruction.

A general drawback of projection data-based methods is the requirement of
perfectly matched projections: not all dual-energy detection techniques are
able to measure line integrals at exactly the same positions for each
spectrum.
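
One conceivable way to solve such a two-by-two non-linear system numerically is a damped Newton iteration on the log-measurements. The sketch below is not the method of any particular scanner or publication: the basis functions `f_water` and `f_bone` and the parabolic weightings are invented toy models, the detector sensitivity is folded into the weighting, and all names are hypothetical.

```python
import math

ENERGIES = [20.0 + 5.0 * k for k in range(24)]    # 20 keV to 135 keV

def f_water(e):
    """Assumed water basis function (toy model)."""
    return 5200.0 / e**3 + 0.16

def f_bone(e):
    """Assumed bone basis function (toy model)."""
    return 30720.0 / e**3 + 0.16

def weighting(kvp):
    """Toy product S(E) * D(E): parabola that vanishes at 0 and at the kVp."""
    return [e * (kvp - e) if e < kvp else 0.0 for e in ENERGIES]

WEIGHTINGS = [weighting(80.0), weighting(140.0)]

def log_attenuation(a_w, a_b):
    """Return -ln(I_i / I_i0) for both weightings and the 2x2 Jacobian."""
    values, jacobian = [], []
    for weights in WEIGHTINGS:
        terms = [w * math.exp(-f_water(e) * a_w - f_bone(e) * a_b)
                 for e, w in zip(ENERGIES, weights)]
        total = sum(terms)
        values.append(-math.log(total / sum(weights)))
        jacobian.append([
            sum(t * f_water(e) for e, t in zip(ENERGIES, terms)) / total,
            sum(t * f_bone(e) for e, t in zip(ENERGIES, terms)) / total,
        ])
    return values, jacobian

def decompose(measured, iterations=50):
    """Damped Newton iteration for the line integrals A_W and A_B."""
    a_w, a_b = 0.0, 0.0
    for _ in range(iterations):
        values, jac = log_attenuation(a_w, a_b)
        r0, r1 = values[0] - measured[0], values[1] - measured[1]
        det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
        step_w = (jac[1][1] * r0 - jac[0][1] * r1) / det
        step_b = (jac[0][0] * r1 - jac[1][0] * r0) / det
        scale, norm = 1.0, r0 * r0 + r1 * r1
        while scale > 1e-6:                       # backtracking line search
            trial, _ = log_attenuation(a_w - scale * step_w,
                                       a_b - scale * step_b)
            if (trial[0] - measured[0])**2 + (trial[1] - measured[1])**2 <= norm:
                break
            scale *= 0.5
        a_w, a_b = a_w - scale * step_w, a_b - scale * step_b
    return a_w, a_b

# Simulate a noise-free dual energy measurement and invert it again.
measured, _ = log_attenuation(10.0, 1.0)
recovered_w, recovered_b = decompose(measured)
print(recovered_w, recovered_b)
```

The sketch ignores noise entirely; with real data, the weak spectral separation of the two weightings makes the inversion noise-sensitive.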
Geek Box 8.9: Image-Based Basis Material Decomposition


Image-based basis material decomposition avoids the matched projection
problem by performing the decomposition in the reconstruction domain. For
this purpose, the reconstructed attenuation values need to correspond to a
constant, known energy weighting throughout the CT volume. For data measured
with a polychromatic source, this can be achieved by a quantitative beam
hardening correction, which homogenizes the energy weighting throughout the
reconstructed CT volume. The homogenized energy weighting is denoted
$\tilde{w}_i(E)$. Here, $i$ numbers the $N_i$ spectral measurements. As for
projection data-based basis material decomposition, multiple measurements at
different energy weightings are required. The relation between the spectral
attenuation coefficient and the measured attenuation coefficient after beam
hardening correction $\bar{\mu}_i(\mathbf{r})$ is defined by the energy
weighting:

$$\bar{\mu}_i(\mathbf{r}) = \int_0^\infty \tilde{w}_i(E)\, \mu(\mathbf{r}, E)\, \mathrm{d}E. \tag{8.57}$$

With the two-basis-material decomposition of $\mu(\mathbf{r}, E)$ in
Equation (8.53), we get

$$\bar{\mu}_i(\mathbf{r}) = \int_0^\infty \tilde{w}_i(E) \left( c_W(\mathbf{r})\, f_W(E) + c_B(\mathbf{r})\, f_B(E) \right) \mathrm{d}E. \tag{8.58}$$

Here, we can exchange summation and integration,

$$\bar{\mu}_i(\mathbf{r}) = c_W(\mathbf{r}) \int_0^\infty \tilde{w}_i(E)\, f_W(E)\, \mathrm{d}E + c_B(\mathbf{r}) \int_0^\infty \tilde{w}_i(E)\, f_B(E)\, \mathrm{d}E. \tag{8.59}$$

The complete basis material decomposition with all measurements then leads
to the following linear system of equations:

$$\bar{\boldsymbol{\mu}}(\mathbf{r}) = \mathbf{K}\, \mathbf{c}(\mathbf{r}) \tag{8.60}$$

with, for two spectral measurements,
$\bar{\boldsymbol{\mu}}(\mathbf{r}) = (\bar{\mu}_1(\mathbf{r}), \bar{\mu}_2(\mathbf{r}))^\top$,
$\mathbf{c}(\mathbf{r}) = (c_W(\mathbf{r}), c_B(\mathbf{r}))^\top$, and

$$\mathbf{K} = \begin{pmatrix} \int_0^\infty \tilde{w}_1(E)\, f_W(E)\, \mathrm{d}E & \int_0^\infty \tilde{w}_1(E)\, f_B(E)\, \mathrm{d}E \\ \int_0^\infty \tilde{w}_2(E)\, f_W(E)\, \mathrm{d}E & \int_0^\infty \tilde{w}_2(E)\, f_B(E)\, \mathrm{d}E \end{pmatrix}.$$

The quantitative accuracy of the image-based basis material decomposition
approach depends on the accuracy of the beam hardening correction. In
addition, the image quality of the resulting basis material images is
reduced, since the solution of Equation (8.60) is very sensitive to noise in
the input data. More advanced image-based material decomposition methods
have since been developed to overcome these drawbacks.
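
For two measurements, the image-based decomposition reduces to building a two-by-two matrix once and solving a small linear system per voxel. The following sketch illustrates this under assumptions: the energy weightings and the basis functions are invented toy models, the integrals are approximated by sums over a coarse energy grid, and the function names are hypothetical.

```python
ENERGIES = [20.0 + 5.0 * k for k in range(24)]    # 20 keV to 135 keV

def f_water(e):
    """Assumed water basis function (toy model)."""
    return 5200.0 / e**3 + 0.16

def f_bone(e):
    """Assumed bone basis function (toy model)."""
    return 30720.0 / e**3 + 0.16

def normalized_weighting(kvp):
    """Toy homogenized energy weighting that sums to one."""
    raw = [e * (kvp - e) if e < kvp else 0.0 for e in ENERGIES]
    total = sum(raw)
    return [value / total for value in raw]

def system_matrix(weightings):
    """One row per measurement: weighted sums of both basis functions."""
    return [[sum(w * f_water(e) for e, w in zip(ENERGIES, weights)),
             sum(w * f_bone(e) for e, w in zip(ENERGIES, weights))]
            for weights in weightings]

def decompose_voxel(mu_bar, k):
    """Solve the 2x2 system mu_bar = K c with Cramer's rule."""
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    c_w = (mu_bar[0] * k[1][1] - k[0][1] * mu_bar[1]) / det
    c_b = (k[0][0] * mu_bar[1] - mu_bar[0] * k[1][0]) / det
    return c_w, c_b

K = system_matrix([normalized_weighting(80.0), normalized_weighting(140.0)])

# Forward-simulate one voxel with known coefficients, then invert.
true_c = (0.9, 0.3)
mu_bar = [K[0][0] * true_c[0] + K[0][1] * true_c[1],
          K[1][0] * true_c[0] + K[1][1] * true_c[1]]
c_w, c_b = decompose_voxel(mu_bar, K)
print(c_w, c_b)
```

The determinant of the matrix shrinks when the two weightings overlap strongly, which is one way to see why the solution amplifies noise in the input images.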
9.1 Introduction


Modern medical imaging is achieved with state-of-the-art devices that use
cutting-edge technology from different fields of engineering. Oftentimes,
progress is driven by discoveries in a field of research that seems
unrelated at first sight. This enables the construction of new devices that
were thought to be impossible before. In this chapter, we introduce a new
imaging modality that has the potential to develop into a future medical
imaging technology: X-ray phase-contrast imaging. At its current state, its
medical use still has to be demonstrated. Yet, several early experimental
results indicate that it has some potential for clinical applications.

Conventional X-ray imaging measures the attenuation of X-rays. X-ray
attenuation happens due to different interactions between X-ray photons and
matter (see Chapter 7 for details). However, attenuation does not fully
describe X-ray interaction with matter. In the early 1930s, phase contrast
microscopy, described in Chapter 5, was introduced. It measures refraction
of visible light, which provides an alternative to standard transmission
microscopy for mostly translucent objects, such as biological cells. Since
visible light and X-rays are both electromagnetic waves, it is theoretically
possible to transfer this principle to X-ray imaging. Measuring X-ray
refraction has two motivations: First, it may allow visualizing materials
whose attenuation properties differ only slightly. Second, it was assumed
that refraction information would deliver improved contrast over
attenuation (see Geek Box 9.1 for details).

Geek Box 9.1: Complex Index of Refraction


As stated in Chapter 5, light propagation can be modeled with the index of
refraction $n$. When operating at X-ray energies, the index of refraction is
expressed as a complex number,

$$n_\text{complex} = 1 - \delta + i\beta, \tag{9.1}$$

where the imaginary coefficient $\beta$ models attenuation and the real
coefficient $\delta$ models phase shift. At X-ray energies,
$n_\text{complex}$ is very close to 1, i.e., little attenuation and
refraction occurs. For example, for a transition between vacuum and water at
a (relatively low) X-ray energy of 20 keV, $\beta = 3.99411 \cdot 10^{-10}$
and $\delta = 5.76149 \cdot 10^{-7}$. Note that for this configuration,
$\delta$ is about a factor of 1000 larger than $\beta$. If we loosely
identify $\beta$ with the quantity measured in traditional X-ray, and
$\delta$ with the quantity measured in phase-sensitive X-ray, we get an
intuition about the vision of early attempts to translate phase-sensitive
X-ray into the hospital: shouldn't it be possible with phase-sensitive X-ray
systems to obtain a signal that is 1000 times stronger at a radiation dose
equal to traditional X-ray? Later research has shown that the necessary
compromises in system design to measure phase consume most of this
advantage.
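
The magnitudes quoted in Geek Box 9.1 can be checked with a short calculation. The sketch below assumes the values $\beta = 3.99411 \cdot 10^{-10}$ and $\delta = 5.76149 \cdot 10^{-7}$ for water at 20 keV and uses the standard textbook relations $\mu = 4\pi\beta/\lambda$ for the linear attenuation coefficient and $2\pi\delta/\lambda$ for the phase shift per unit length; these relations and the variable names are additions for illustration and do not appear in the text above.

```python
import math

BETA = 3.99411e-10          # imaginary part for water at 20 keV (assumed reading)
DELTA = 5.76149e-7          # real decrement for water at 20 keV (assumed reading)
ENERGY_KEV = 20.0
HC_KEV_NM = 1.23984198      # Planck constant times speed of light in keV * nm

wavelength_m = HC_KEV_NM / ENERGY_KEV * 1e-9

ratio = DELTA / BETA
mu_per_cm = 4.0 * math.pi * BETA / wavelength_m / 100.0
phase_shift_rad_per_cm = 2.0 * math.pi * DELTA / wavelength_m / 100.0

print(f"delta / beta           : {ratio:.0f}")
print(f"attenuation coefficient: {mu_per_cm:.3f} 1/cm")
print(f"phase shift            : {phase_shift_rad_per_cm:.0f} rad/cm")
```

The resulting attenuation coefficient of roughly 0.8 per centimeter is in the range expected for water at this energy, which supports the order of magnitude of the quoted $\beta$.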

However, since the wavelength of X-rays is several orders of magnitude 
smaller than the wavelength of visible light, it took several decades until 
manufacturing technologies were developed to realize X-ray phase-contrast 
imaging. 

Several systems have been designed for phase contrast imaging. While all of
these approaches are highly interesting from a physics point of view, we
limit the presentation in this chapter to the Talbot-Lau Interferometer
(TLI). A short description of other approaches can be found in Geek Box 9.2.
Most of these other approaches are more sensitive than TLI, but TLI's
relatively mild system requirements make it currently the most attractive
system for implementation in a clinical setup. In particular, it only
requires additional gratings to be mounted between a regular medical X-ray
tube and detector. Note that all of the methods presented in this chapter
are current research topics and none of them is clinically used at present.
However, it is expected that the presented methods will have clinical impact
in the future.

We first introduce some physical preliminaries and then the Talbot-Lau
interferometer itself. As an outlook, we present potential applications of
this modality in medicine and visual inspection.

Geek Box 9.2: Other Setups


A number of different systems have been proposed to measure the phase of
light at X-ray energies. Below is a list of setups that have been regularly
mentioned in the recent literature.

• Propagation-based: This is probably the simplest phase-sensitive design.
  Phase is measured via interference: a wavefront that is refracted by a
  material interferes with itself [19, 22]. To compute the refraction (and
  hence the phase), it suffices to acquire two or more traditional X-ray
  images with varying detector distance. On the downside, interference only
  occurs if the X-ray source has a very small focal spot and the distance
  between object and detector is large enough. X-ray tubes with a small
  focal spot currently suffer from low flux, which limits practical
  applications.
• Edge illumination: This setup aims at directly measuring refraction by
  placing absorption masks in the beam path [14]. Depending on how much the
  object refracts the beam, a larger or smaller part of a detector pixel is
  illuminated. This design is conceptually very straightforward, but due to
  the direct measurement of the angle of refraction, it is less sensitive
  than other systems.
• Analyzer-based: Analyzer-based systems operate on monochromatic X-rays,
  which allows precise measurement of the refractive angle for a material
  [6, 4]. The beam is reflected by a crystal (the so-called “analyzer”)
  behind the object. This crystal has the special property that it reflects
  radiation only at the Bragg angle, i.e., in a very narrow angular
  interval. Rocking the crystal allows precise measurement of all occurring
  angles of refraction. While this setup is extremely sensitive to
  mechanical motion and requires monochromatic X-rays, it is unmatched in
  its sensitivity and dose efficiency.
• Speckle tracking: Another approach to direct phase measurement is to track
  the refraction of a predefined pattern in the beam path [3]. One can
  obtain such a pattern by introducing, for example, a sheet of sandpaper in
  the beam path. Speckle tracking works best on thin samples and also
  requires an X-ray source with a small focal spot.


9.2 Talbot-Lau Interferometer 


One of the most promising phase-sensitive setups for medical applications is
the so-called Talbot-Lau interferometer (TLI) [5, 15]. It is rooted in
Young's famous double-slit experiment from 1801, which demonstrated
interference. Young performed the experiment with visible light, but it can
be directly adapted to X-rays. A sketch of this experiment is shown in
Fig. 9.1 (left). The light source, in our case an X-ray source, is shown on
the left. We assume that X-rays only emerge from a small source point,
indicated by the narrow slit on the left. There is a barrier with two narrow
slits in the beam path at some distance before an X-ray detector. The
observation of the double-slit experiment is that an interference pattern
appears at the detector, that is, a regular pattern of bright and dark
spots. The origin of the interference pattern is illustrated in Fig. 9.1
(right). The interference pattern is determined by the path difference
$\Delta d$ of the two X-ray waves traveling from the two slits, namely

$$\Delta d = d \sin(\theta), \tag{9.2}$$

where $d$ is the distance between the two origins. Here, we assume that the
distance $D$ between the slits and the screen is large, such that $\theta$
is close to zero and $\theta \approx \theta'$. Constructive interference
shows as a bright spot. It occurs when the waves arrive “in phase”. This is
the case if the path difference is zero or an integral multiple of the
wavelength $\lambda$,

$$\Delta d = d \sin(\theta) = m \lambda, \quad m \in \mathbb{Z}. \tag{9.3}$$

Destructive interference appears as a dark spot. It occurs if the path
difference is an odd multiple of half the wavelength,

$$\Delta d = d \sin(\theta) = \left(m + \tfrac{1}{2}\right) \lambda, \quad m \in \mathbb{Z}. \tag{9.4}$$

Gray spots are observed if the path difference lies between these two
cases.
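
The conditions for bright, dark, and gray spots can be tried out numerically. The sketch below assumes the idealized two-beam model with equal amplitudes, in which the relative intensity is $\cos^2(\pi \Delta d / \lambda)$; this formula, the chosen slit distance, and the function names are assumptions for illustration and are not taken from the text.

```python
import math

def path_difference(d, theta):
    """Path difference between the waves from both slits."""
    return d * math.sin(theta)

def relative_intensity(d, theta, wavelength):
    """Idealized two-beam interference: 1 is bright, 0 is dark."""
    return math.cos(math.pi * path_difference(d, theta) / wavelength) ** 2

def bright_angle(m, d, wavelength):
    """Angle of the m-th constructive interference order."""
    return math.asin(m * wavelength / d)

def dark_angle(m, d, wavelength):
    """Angle of the m-th destructive interference order."""
    return math.asin((m + 0.5) * wavelength / d)

SLIT_DISTANCE_M = 2.0e-6        # hypothetical slit distance of 2 micrometers
WAVELENGTH_M = 6.2e-11          # roughly the wavelength of 20 keV X-rays

for m in range(3):
    theta_bright = bright_angle(m, SLIT_DISTANCE_M, WAVELENGTH_M)
    theta_dark = dark_angle(m, SLIT_DISTANCE_M, WAVELENGTH_M)
    print(m,
          relative_intensity(SLIT_DISTANCE_M, theta_bright, WAVELENGTH_M),
          relative_intensity(SLIT_DISTANCE_M, theta_dark, WAVELENGTH_M))
```

The angles come out in the range of tens of microradians, which hints at why the resulting pattern is much finer than a typical detector pixel.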
Figure 9.1: Young's Double-Slit Experiment

Figure 9.2: Talbot-Lau interferometer. Between X-ray source and detector,
the three gratings G0, G1, G2 form an interferometer.


9.2.1 Talbot-Lau Interferometer Setup 


The Talbot-Lau interferometer makes use of the interference effect. It
consists of a conventional X-ray device with three rectangular gratings G0,
G1, and G2 in the beam line (Fig. 9.2). These gratings typically have a
period between 1 and 20 µm and a duty cycle of 0.5, i.e., grating bars and
slits have the same width. G1 is a phase grating and constitutes the core of
the interferometer. G0 and G2 have supporting functions and will be
introduced later.

At the slits of grating G1, the wave passes without notable modification.
At the bars of grating G1, the phase of the incoming wave is shifted by an
additive offset between 0 and 2π. This is the trick that turns Young's
double-slit experiment into an actual imaging system. Constructive or
destructive interference can now happen between wave sections traveling
through two grating slits (as stated above), or between wave sections
traveling through a grating slit and a grating bar. In the second case, the
path difference Δd between a grating bar and a grating slit must now add up
to the imprinted phase shift for constructive interference. Due to the
Talbot effect, these interference patterns cleanly overlay at the so-called
Talbot distances. Fig. 9.3 shows a simulation of the interference pattern
behind grating G1. The wave is assumed to travel from left to right. The
detector is placed at a location where constructive and destructive
interference form a clearly distinguishable black-and-white pattern. The
pattern repeats periodically, which is why a system designer can in theory
choose from infinitely many positions to place the detector. In practice,
however, the pattern quickly washes out, such that the detector has to be
located at one of the first replications of the pattern.

Figure 9.3: Simulated Talbot carpet: interference pattern behind grating
G1, with the wave travelling from left to right.

Gratings G0 and G2 are used to solve several engineering problems such that
the interferometer can be built within compact medical X-ray setups. The
interference pattern at the detector is normally much smaller than a single
detector pixel. To resolve the pattern, the analyzer grating G2 is used. G2
is an absorption grating such that only a part of the interference pattern
passes through the slits onto the detector. This makes it possible to sample
the interference pattern by taking X-ray images while moving G2 along the
pattern (which is further discussed in Sec. 9.2.2).

Grating G0 addresses another practical problem, namely the size of the focal
spot. Interference effects can only be observed on coherent waves. In the
sketch of Young's double-slit experiment in Fig. 9.1, coherence is obtained
automatically by the small slit behind the X-ray source, which effectively
acts as a small focal spot. There exist so-called microfocus X-ray sources
that do provide such a small focal spot, but they produce only very few
X-rays per unit time. For practical applications, the resulting imaging
times are typically much too long. Medical X-ray tubes produce orders of
magnitude more X-rays, which allows taking an X-ray image within a fraction
of a second. However, such X-ray tubes can only be built with a focal spot
that is much larger, typically between half a millimeter and a millimeter.
To overcome this issue, grating G0 is used, which can be seen as a long
array of micrometer-sized slits. Each slit acts as a microfocus spot and
creates an interference pattern at the detector. The distances between the
slits are chosen in a way that the pattern of each of the slits exactly
overlays with the pattern of the other slits. If the setup parameters are
chosen correctly (cf. Geek Box 9.3), all these periodic structures at the
detector are aligned and add up. This enables imaging using conventional
X-ray sources.
Geek Box 9.3: Talbot-Lau Setup Parameters


As described in Section 9.2, the TLI consists of three gratings, a medical
X-ray source, and a detector. Many details on the construction of a
Talbot-Lau interferometer can be found, for example, in the Ph.D. thesis by
Martin Bech [2].

Optimization of the imaging performance of the whole setup requires
exploring a huge parameter space. Each grating design already has multiple
degrees of freedom. First of all, the grating material has to be chosen.
Typical materials are gold, nickel and aluminum. The choice of the material
and the manufacturing technology can constrain the other parameters of the
grating, such as the height, period and duty cycle. The duty cycle is the
ratio between the width of a grating bar and one grating period. Typical
grating periods lie in the range between 1 µm and 10 µm. Additionally, the
grating aspect ratio, which is the grating height divided by the width, is
currently limited to about 50. The design of one grating cannot be done in
isolation. Instead, the whole imaging system has to be considered. For
example, the Talbot effect yields a limited set of possible distances
between G1 and G2 for a specific energy. Another example is the G0 grating,
where the Lau effect can be used to fix either its period or its position.

Due to the huge parameter space, parameter dependencies and the
polychromatic spectrum of medical X-ray tubes, optimization of a setup is
challenging. In practice, one tries to hold most of the parameters fixed.
Then, the remaining parameter space is explored by simulating the
corresponding interferometer using numerical wave front propagation
algorithms.


One quality measure of the TLI can be derived from the system setup
directly, the so-called sensitivity $s$,

$$s = \frac{2\pi \operatorname{dist}(G_1, G_2)}{p_2}, \tag{9.5}$$

defined by the distance between G1 and G2 and $p_2$, which is the period of
the analyzer grating G2. The sensitivity can be interpreted as an
“amplification factor” of the refractive angle.


9.2.2 Phase Stepping and Reconstruction 


Phase stepping denotes the process of shifting one of the gratings
(typically G2) by a fraction of its period to sample the interference
pattern. This is shown in Fig. 9.4 (left): the Talbot carpet forms the
interference pattern. At equidistant stepping positions, the detector
records an image. The intensity values for a pixel at different stepping
positions are called the phase stepping curve. This curve can be considered
as a convolution of the profile of G2 with the intensity pattern generated
by G1. Convolution of the two rectangular functions of G1 and G2 ideally
leads to a triangular signal. In reality, blurring from the slits of G0
leads to a sinusoidal curve that is fitted to the measured steps. These
steps are shown in Fig. 9.4 (right). After fitting a sine function to the
phase stepping curve, three quantities can be calculated: attenuation,
differential phase, and dark-field. Attenuation is the offset of the
intensities, which can be computed as the average of all intensities of the
phase stepping curve. Differential phase is the phase offset of the sine.
Dark-field is one minus the ratio between the amplitude of the curve and two
times the attenuation.

Figure 9.4: Left: phase stepping. The Talbot carpet is sampled at different
positions of grating G2. Right: the intensities at different stepping
positions form the sinusoidal phase stepping curve. From this curve,
attenuation, differential phase and dark-field can be calculated.
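
Fitting a sine to the phase stepping curve of one pixel can be sketched compactly if the steps are equidistant and cover exactly one grating period; in that case the least-squares solution coincides with the first Fourier coefficients. This restriction, the parametrization offset + amplitude · sin(x + phase), and the function names are assumptions of the sketch, which also uses the definition of visibility as amplitude divided by offset.

```python
import math

def fit_phase_stepping_curve(intensities):
    """Least-squares fit of offset + amplitude * sin(x_k + phase) to n >= 3
    equidistant steps x_k = 2*pi*k/n that cover exactly one grating period."""
    n = len(intensities)
    if n < 3:
        raise ValueError("at least three phase steps are needed")
    offset = sum(intensities) / n
    # First Fourier coefficients; for equidistant steps over one period
    # they coincide with the least-squares solution.
    c = 2.0 / n * sum(v * math.cos(2.0 * math.pi * k / n)
                      for k, v in enumerate(intensities))
    s = 2.0 / n * sum(v * math.sin(2.0 * math.pi * k / n)
                      for k, v in enumerate(intensities))
    amplitude = math.hypot(c, s)
    phase = math.atan2(c, s)        # c = A * sin(phase), s = A * cos(phase)
    return offset, amplitude, phase

def visibility(offset, amplitude):
    """Contrast of the intensity modulation."""
    return amplitude / offset

# Simulated pixel with eight phase steps and known parameters.
steps = 8
curve = [100.0 + 30.0 * math.sin(2.0 * math.pi * k / steps + 0.7)
         for k in range(steps)]
offset, amplitude, phase = fit_phase_stepping_curve(curve)
print(offset, amplitude, phase, visibility(offset, amplitude))
```

For steps that do not cover a full period, or for non-equidistant steps, a general linear least-squares fit would be needed instead of this shortcut.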

In practice, these three quantities cannot be calculated directly. Instead,
it is necessary to acquire two scans: a reference scan and an object scan.
The reference scan is acquired without an object in the beam path such that
it only shows the background. It captures inhomogeneities in the setup, for
example, in the grating bars. The object scan is acquired with an object
before or behind G1. The reference scan is used to normalize the object
scan after calculating attenuation, differential phase, and dark-field for
both scans. Several works address the further suppression of imaging
artifacts in software [9, 8, 11]. More details on the calculation of
attenuation, phase, and dark-field can be found in Geek Box 9.4.

A common metric for the quality of an interferometer is visibility. Visibility 
is a measure of contrast of the intensity modulation and is given by the sine 
amplitude divided by its offset. Thus, the dark-field signal corresponds to the 
reduction in interferometer visibility, for example, due to micro scattering. 
Geek Box 9.4: Signal Reconstruction


Let us formalize the computation of attenuation, differential phase shift,
and dark-field information for each pixel. Let $R$ denote the wave profile
of the reference scan and $O$ the wave profile of the object scan, each
consisting of $n$ phase steps. The calculation is performed for each pixel
individually, which is why we omit the pixel coordinate in the equations
below. Attenuation is obtained from the ratio of the average object signal
to the average reference signal by computing

$$\text{attenuation} = -\ln \frac{\sum_{i=1}^{n} O_i}{\sum_{i=1}^{n} R_i}.$$

The computation of the differential phase shift requires finding the
parameters of the sinusoidal phase stepping curve. The most robust way to do
this is to perform least-squares curve fitting, which gives offset,
amplitude, and phase of the sine. For the computation of the differential
phase, the phases of the object and reference wave profiles are subtracted.

The dark-field information is a function of the visibilities of the
reference scan $V_R$ and the object scan $V_O$. $V_R$ is computed as

$$V_R = \frac{\max(R) - \min(R)}{\max(R) + \min(R)},$$

where the maximum and minimum operators refer to the maximum and minimum
function values of the fitted sine curve on $R$. The visibility $V_O$ is
computed in the same way. The dark-field is then computed from $V_R$ and
$V_O$.
The visibility of the reference signal is considered an important figure of merit 
of the interferometer as it determines the noise in the differential phase and 
dark-field images. 
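
The per-pixel computation described in Geek Box 9.4 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the reference implementation: the function names are hypothetical, the phase steps are assumed to be equidistant over exactly one grating period, and the dark-field is taken as the negative logarithm of the visibility ratio, which is one common convention that the text above does not confirm.

```python
import numpy as np


def fit_sine(curve):
    """Least-squares fit of offset + amplitude * sin(x + phase).

    Assumes n equidistant phase steps covering exactly one grating period.
    """
    curve = np.asarray(curve, dtype=float)
    n = curve.size
    x = 2.0 * np.pi * np.arange(n) / n
    # The sine model is linear in (a0, a1, a2): a0 + a1*cos(x) + a2*sin(x).
    design = np.column_stack([np.ones(n), np.cos(x), np.sin(x)])
    a0, a1, a2 = np.linalg.lstsq(design, curve, rcond=None)[0]
    amplitude = np.hypot(a1, a2)
    # A*sin(x + p) = A*sin(p)*cos(x) + A*cos(p)*sin(x), hence p = atan2(a1, a2).
    phase = np.arctan2(a1, a2)
    return a0, amplitude, phase


def reconstruct_pixel(reference, obj):
    """Return (attenuation, differential phase, dark-field) for one pixel."""
    off_r, amp_r, phase_r = fit_sine(reference)
    off_o, amp_o, phase_o = fit_sine(obj)
    attenuation = -np.log(np.sum(obj) / np.sum(reference))
    # Wrap the phase difference into (-pi, pi].
    diff_phase = np.angle(np.exp(1j * (phase_o - phase_r)))
    # For a fitted sine, (max - min) / (max + min) equals amplitude / offset.
    vis_r = amp_r / off_r
    vis_o = amp_o / off_o
    # ASSUMPTION: logarithmic dark-field convention, not taken from the text.
    dark_field = -np.log(vis_o / vis_r)
    return attenuation, diff_phase, dark_field


# Synthetic example with hypothetical values (8 phase steps).
steps = 2.0 * np.pi * np.arange(8) / 8
reference_curve = 1000.0 * (1.0 + 0.30 * np.sin(steps))
object_curve = 600.0 * (1.0 + 0.15 * np.sin(steps + 0.4))
attenuation, diff_phase, dark_field = reconstruct_pixel(reference_curve, object_curve)
print(attenuation, diff_phase, dark_field)
```

Writing the sine as a linear combination of a constant, a cosine, and a sine turns the fit into an ordinary linear least-squares problem, so no iterative optimizer is needed.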


9.3 Applications 


An example of the three resulting signals is shown in Fig. 9.5. The gummi 
bears shown are modified with artificial defects, namely powder, a needle, 
and a toothpick. In the top right, the absorption image is shown, which clearly 
shows the metal needle. On the bottom left, the differential phase is shown, 
which shows a large amount of high-frequency details, including the center of 
the needle head and the powder structure. In the bottom right, the dark-field 
signal is shown, which is particularly sensitive to the fine-grained structural 
variations in the powder and the toothpick. 

Figure 9.5: Top left: photograph of gummi bears with artificial defects. 
Attenuation (top right), differential phase (bottom left) and dark-field (bottom 
right) visualize different modifications in the gummi bears. Differential phase 
is particularly sensitive to high-frequency variations in the signal. Dark-field 
is particularly sensitive to micrometer-sized structural variations, such as 
powders or fibers. Pictures courtesy of ECAP Erlangen. 

An example scan of biological data is shown in Fig. 9.6. From left to 
right, the images show attenuation, differential phase, and dark-field of a 
female breast [10]. The dark-field image is of particular interest here 
because it is sensitive to microcalcifications in the breast, 
a common indication of breast cancer. 

Overall, X-ray attenuation visualizes variations in both density and atomic 
number. Thus, for example, it excels at the visualization of bones. Bones are 
denser than the surrounding tissue and also contain a substantial amount 
of calcium. 

Phase information, on the other hand, is sensitive to variations in electron 
density. Thus, it is expected to deliver increased contrast over attenuation 
when there is a similar elemental composition between two structures. One 
example is imaging of soft tissues. Talbot-Lau interferometers can only obtain 
differential phase information. Hence, their main advantage is in imaging 
high-frequency details, such as edges that lie perpendicular to the grating bars. 
Conversely, they are less effective for imaging low-frequency information. 

Dark-field imaging provides two interesting properties that set it apart 
from absorption and differential phase: First, dark-field is sensitive to 
density variation at nano- and micrometer scale. This is typically below the 
resolution of the detector, and as such too small to be resolved in attenuation 
imaging. Second, when scanning ordered structures such as fibers, the 
dark-field intensity varies with the angle between the fibers and the gratings. 
Both properties together can be used to deduce the orientation of fibers below 
the resolution limit of the detector [7, 12, 1]: when performing tomography 
on a plane in which micrometer-scaled fibers are located, the dark-field signal 
oscillates. From this oscillation, the direction of the fibers can be deduced, 
although the individual fibers are too small to be resolved by the detector. 
An example is shown in Fig. 9.7. On the left, a wooden block with different 
layers of wooden fibers is shown. This block is scanned in a tomographic 
setup, and the fiber orientations are deduced from the signal oscillations [1]. 
On the right of Fig. 9.7, the reconstructed orientations are shown, in different 
colors per layer. 

Figure 9.6: Attenuation, differential phase, and dark-field images of a 
female breast [10]. Microcalcifications in the dark-field signal (red arrows) can 
indicate cancer. 

Overall, phase-contrast and dark-field signals offer several interesting 
properties. Whether phase and dark-field signals or conventional attenuation 
is advantageous depends on the imaging task. The list below enumerates 
applications where Talbot-Lau imaging can potentially offer an advantage 
over traditional absorption imaging. 


• Mammography. Mammography relies on imaging soft tissue structures 
in the breast, as well as on the detection of micro-calcifications. Imaging 
of soft tissues could benefit from the phase signal since it provides a 
strong signal-to-noise ratio at high frequencies. It has also been shown 
that the dark-field signal can reveal micro-calcifications that are invisible 
in the attenuation image since their porous structure creates dark-field 
signals [13]. 

• Lung imaging. The human lung relies on millions of small alveoli to 
perform gas exchange. Lung diseases such as chronic obstructive pulmonary 
disease (COPD) and pulmonary fibrosis lead to a change in the structure 
of the alveoli. However, due to their small size, they cannot be resolved 
individually. The dark-field signal is able to detect abnormalities in the 
alveoli microstructure and may thus provide a benefit when diagnosing 
these diseases [23]. 

• Bone imaging. The directionality of the dark-field signal can possibly be 
used to detect osteoporosis, which can lead to less aligned structures in the 
bone [21]. Furthermore, the phase signal can visualize low-contrast 
structures such as cartilage and tendons, which may not be visible in attenuation 
imaging. 

• Micro-CT. CT scanners which provide high resolutions (voxel sizes on 
the order of micrometers) are called Micro-CT systems. They are used for 
analyzing small samples and animals. At high resolutions, phase-contrast 
CT delivers a higher image quality than conventional CT [16]. This can 
be explained by the fact that the recorded phase signal is differential. A 
first Talbot-Lau Micro-CT system is commercially available [20]. 

• Industrial applications. Beyond medical imaging, dark-field imaging 
has been applied to non-destructive testing [18, 17]. For example, it can 
detect defects in carbon fibers or foreign bodies in food that are 
undetectable using attenuation imaging. 

Figure 9.7: Layered wooden block (left) and example dark-field 
reconstruction of the dominant fiber orientations in each layer (right). 


9.4 Research Challenges 


The Talbot-Lau X-ray interferometer is an emerging imaging modality. 
Several questions need to be addressed before it can be applied in a clinical 
environment. The most notable issues are: 


• Grating manufacturing and mechanical setup. Manufacturing grating 
structures with a period of a few micrometers and a sufficient grating 
height (in order to block high energy X-rays) is challenging. Thus, with 
current technology, the efficiency of the Talbot-Lau interferometer decreases 
with increasing photon energy. Additionally, the diameter of the gratings is 
limited to a few centimeters. Stitching procedures, which combine smaller 
gratings into a big grating, as well as smart scanning approaches that are 
compatible with smaller gratings are currently under research. Due to the 
small grating period, system stability in the nanometer range is required 
during operation. This stability is difficult to achieve in clinical 
environments. 

• Medical applications and dose. A Talbot-Lau system is approximately 
half as dose-effective as a conventional X-ray system since the $G_2$ grating 
ideally absorbs half of the radiation. This leads to the question whether 
the additional information provided by a Talbot-Lau system is worth a 
reduction in attenuation dose efficiency. Economic aspects also need to be 
considered: manufacturing the gratings and their support structures adds 
considerable costs to an X-ray system. Is the benefit of the information 
provided by the Talbot-Lau system worth this cost? 

• Optimal system design. Each grating introduces a new set of 
parameters into the system design (see Geek Box 9.3). Furthermore, the system 
does not only have to be optimized for attenuation, but also for phase and 
dark-field imaging performance. Due to this complexity, determining the 
optimal setup parameters (under the constraints imposed by, e.g., 
manufacturing) for a specific imaging task is still an open problem. 

• Image processing algorithms. Image processing algorithms are needed 
in many steps of the conventional X-ray imaging pipeline, for example, for 
artifact correction, denoising, and visualization. Talbot-Lau interferometers 
suffer from additional issues (e.g., artifacts due to inexact grating 
alignment). Also, the differential phase and dark-field images are affected by 
artifacts similar to those in the attenuation image, such as beam hardening. 
Additionally, the information retrieval from phase stepping data itself requires a 
reconstruction algorithm. Thus, image processing can be considered a 
necessary component of a Talbot-Lau imaging system. 

• Tomographic reconstruction. The phase information obtained by a 
Talbot-Lau interferometer can be reconstructed in a similar way to 
attenuation information by using an appropriate reconstruction filter, the 
so-called Hilbert filter. However, the dark-field signal (which contains 
scattering information) is directional and also influenced by signals at the object 
edges. This makes its tomographic reconstruction challenging and has led 
to the development of dedicated algorithms to solve this problem [1]. 


10.1 Introduction 


In contrast to the structural imaging used to visualize tissues in the body, 
functional imaging is used to observe biological processes. In the field of 
nuclear medicine, functional imaging relies on radioisotopes that are tagged 
to tracers whose biochemical properties cause them to congregate at regions 
of diagnostic interest in the body. As opposed to transmission tomography 
with X-ray CT, where the source of imaging radiation is a part of the imaging 
device, the radiation source in this case is located within the patient. For 
this reason, functional imaging methods in the field of nuclear medicine — 
also known as molecular imaging — belong to a family of modalities called 
emission tomography, whose differing physical properties make them quite 
distinct from the transmission case. 

The process begins with radioactive decay, which results when an unstable 
isotope ejects particles from its nucleus while transitioning to a stable state. 
Although decay is a very complicated process, only two modes are of interest 
to molecular imaging: γ and β. In the former case, gamma rays are ejected 
directly from the nucleus and can be imaged with a so-called gamma camera. 
3-D images can then be reconstructed from 2-D projections in a process called 
SPECT. In the second case, a positron is emitted which travels a small distance 
until an electron (its antiparticle) is encountered. The ensuing annihilation 
produces a pair of 511 keV photons traveling in opposite directions that, when 
detected simultaneously, yield lines of response that can be reconstructed into 
an image in a process known as positron emission tomography (PET). 

Although X-rays produced by bombarding targets with electrons had been 
in use since their discovery by Röntgen in 1895, the use of naturally decaying 
radioisotopes for medical imaging did not occur until 1935, when George 
de Hevesy investigated rats injected with radioactive ³²P. Using a Geiger 
counter, de Hevesy measured the relative amount of radioactivity in different 
organs after dissection and found that the skeleton had a disproportionately 
high level of uptake. In doing so, he not only settled once and for all the 
ongoing medical question of whether or not bones have an active metabolism 
(they do, otherwise they would not have taken up the ³²P atoms), but he was 
also the first to use radioisotopes and imaging equipment to investigate the 
body's biochemistry. Thus, the so-called tracer principle was born. For his 
work in the field of radiotracers, de Hevesy was awarded the Nobel Prize for 
Chemistry in 1943, and a variant of his original bone-imaging methodology 
based on phosphates is still in wide use today. In the decades following de 
Hevesy's discovery, research in the fields of radiochemistry and molecular 
biology has yielded a plethora of tracers with desirable uptake characteristics. 
Complementary technical advances have provided imaging devices capable of 
helping physicians answer a range of diagnostic questions. 


10.2 Physics of Emission Tomography 


10.2.1 Photon Emission 


Although the properties of γ and β decay are different in many respects, 
they follow the same basic decay law. Namely, the amount of radioactivity S 
(expressed in Becquerel, or decays per second) in a given sample of radioactive 
material will decrease until all atoms in the sample reach a stable state. This 
process follows an exponential curve, and the amount of activity in the sample 
at a given time t can be expressed as 

\[ S(t) = S_0 \, e^{-\ln(2)\, t / t_{1/2}}, \tag{10.1} \]


where $S_0$ is the initial activity, and $t_{1/2}$ is the isotope's half-life, or the 
time it takes for S(t) to decrease to half of $S_0$. This process is illustrated in 
Fig. 10.2, where the blue curve depicts the amount of activity remaining in a 
sample that initially contained 100 MBq. It can be seen from inspection that the 
isotope's half-life is six hours, which is the same as that of ⁹⁹ᵐTc, the most 
commonly used isotope for SPECT imaging. 

Figure 10.1: Simplified representation of both modes of decay relevant to 
emission tomography. On the left is a nucleus undergoing γ decay and emitting 
a single photon directly. On the right is an example of β⁺ decay, where a 
positron is ejected from the nucleus. The positron travels a short distance 
before coming in contact with an electron. The resulting annihilation produces 
a pair of 511 keV photons traveling in opposite directions. 

Figure 10.2: Exponential decay curve for a 100 MBq sample of a radioisotope 
having a half-life of six hours (the same as ⁹⁹ᵐTc). 
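
The decay law of Eq. (10.1) is straightforward to evaluate numerically. The following minimal sketch reproduces a few points of the curve in Fig. 10.2 for a 100 MBq sample with a six-hour half-life; the function name `activity` is chosen here for illustration only.

```python
import math


def activity(s0, t, half_life):
    """Remaining activity after time t (t and half_life in the same unit)."""
    return s0 * math.exp(-math.log(2.0) * t / half_life)


# 100 MBq sample with a six-hour half-life, as in Fig. 10.2.
for hours in (0, 6, 12, 24):
    print(f"t = {hours:2d} h: {activity(100.0, hours, 6.0):.2f} MBq")
```

After one half-life the activity has dropped to 50 MBq, and after four half-lives (24 hours) only 6.25 MBq remain.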

Although Eq. (10.1) represents a sample's aggregate decay properties, the 
emission of individual photons (or photon pairs for β decay) within a particular 
time window is a discrete process and follows a Poisson distribution with 
a mean $v$ proportional to the amount of radioactivity present. Note that we 
can assume independence between the voxels and describe an entire image 
in a vectorized format, using $v$ for a single voxel and $\mathbf{v}$ for an entire 
image. Similarly, the number of photons counted in a particular observation of 
this process, such as a pixel of a SPECT projection, is a Poisson-distributed 
random variable as well, provided that the image formation chain is linear.¹ If 
we represent the projection pixels and image voxels as vectors, the distribution 
of photon counts on the detector D is related to the activity distribution 
being imaged $\mathbf{v}$ via the following relation: 

\[ D \sim \mathrm{Poisson}(A\mathbf{v}), \tag{10.2} \]

where $A \in \mathbb{R}^{M \times N}$ is known as the system matrix and is composed of 
elements $a_{m,n}$ representing the probability that a photon emitted from voxel n 
is detected at pixel m (cf. Geek Box 7.3). M and N are the numbers of detector 
pixels and image voxels, respectively. Multiplying an image vector $\mathbf{v}$ by A 
thus accomplishes a forward projection into the projection space. Acquired 
projection data d from an emission tomography scan therefore represent a 
single sample, or observation, drawn from the distribution D. 

¹ In practical scanners, this is not strictly the case, but it is assumed to be here. 
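
To make Eq. (10.2) concrete, the following sketch draws one observation d from a toy system. All numbers are hypothetical: the system matrix has M = 3 detector pixels and N = 4 voxels, and its entries are made-up detection probabilities rather than values from a real scanner.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system matrix: entry (m, n) is the probability that a photon
# emitted from voxel n is detected at pixel m.
A = np.array([
    [0.02, 0.01, 0.00, 0.00],
    [0.00, 0.01, 0.02, 0.01],
    [0.00, 0.00, 0.01, 0.02],
])

# Hypothetical expected number of emissions per voxel during the acquisition.
v = np.array([500.0, 1000.0, 200.0, 0.0])

expected_counts = A @ v            # forward projection, mean of D
d = rng.poisson(expected_counts)   # one noisy observation drawn from D

print("expected:", expected_counts)
print("observed:", d)
```

Repeating the last step yields a different d each time, which is exactly the sense in which measured projections are single samples of D.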

Eq. (10.2) implies that detected images will always be perturbed by random 
noise, particularly for small numbers of counts. This effect is shown in 
Fig. 10.3, where simulated observations are shown for time points $t = 0$, 
$t = 2t_{1/2}$, $t = 4t_{1/2}$, and $t = 6t_{1/2}$. For each simulation, the total activity 
from the blue curve in Fig. 10.2 corresponding to the time point t was distributed 
uniformly inside the ellipsoidal object, yielding an amount of activity at each 
pixel n that corresponds to the mean value of a Poisson process $v_n(t)$. A 
random number was then drawn from the Poisson distribution at each pixel 
to create the images d(t). This is equivalent to applying Eq. (10.2) with A 
set equal to the identity matrix. 

Central profiles drawn from each image along the blue line on the left of 
Fig. 10.3(a) are shown at the right. At the aggregate level, the simulated 
mean across all the pixels $\bar{d}(t)$ is almost exactly equal to the true mean 
of D(t) and follows the predictable decay curve in Fig. 10.2. However, the noise 
level in the images and profiles appears to increase with t. This behavior is due 
to the fact that the mean of a Poisson distribution is equal to its variance. 
But if the variance decreases with the mean, then why does the noise appear 
to increase? To answer this, we can define a signal-to-noise ratio SNR within 
our homogeneous ellipsoid's boundaries to use as a noise measure. In this case, 
our signal is simply the mean over this object, and the noise is the standard 
deviation $\sigma_d$: 

\[ \mathrm{SNR} = \frac{\bar{d}}{\sigma_d} = \frac{\bar{d}}{\sqrt{\bar{d}}} = \sqrt{\bar{d}}. \tag{10.3} \]
The SNR is thus simply the square root of the mean number of photon counts 
in the object and increases monotonically, albeit with plateauing benefits, 
with the number of counts in the image. 
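
The square-root behavior of Eq. (10.3) can be checked with a small simulation. This is a sketch in the spirit of Fig. 10.3, not a reproduction of it: the initial mean of 1000 counts per pixel and the number of pixels are assumed values, and the object is reduced to a flat list of pixels with identical mean.

```python
import numpy as np

rng = np.random.default_rng(42)

initial_mean = 1000.0   # assumed mean counts per pixel at t = 0
n_pixels = 100_000      # assumed number of pixels inside the object

results = {}
for half_lives in (0, 2, 4, 6):
    mean = initial_mean * 0.5 ** half_lives
    counts = rng.poisson(mean, size=n_pixels)
    snr = counts.mean() / counts.std()
    results[half_lives] = (mean, snr)
    print(f"{half_lives} half-lives: mean = {mean:8.3f}, "
          f"SNR = {snr:6.2f}, sqrt(mean) = {np.sqrt(mean):6.2f}")
```

The measured SNR tracks the square root of the mean closely, so every two half-lives (a factor of four in counts) halve the SNR.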

In X-ray CT imaging, where the radiation source is located outside the 
body and can be easily controlled by the system, large numbers of photons 
are easily attainable, as the patient can be irradiated with a high flux for a 
short period of time. However, in molecular imaging the radioactive source is 
located within the body and will continue to bombard tissue with potentially 
harmful ionizing radiation until it either decays or is excreted by the body. 

Figure 10.3: Simulated images (left) and central horizontal profiles (right) 
from an object filled with the activity described in Fig. 10.2. The images were 
simulated after zero, two, four, and six half-lives (a, b, c, and d, respectively). 
The mean value of the object is shown with a dashed red line through each 
profile. Note how the images become noisier as the mean decreases. 

Figure 10.4: Typical 15 second projection from a skeletal SPECT acquisition. 
Even pixels with the highest counts have only roughly ten photons. Image 
courtesy of the University Hospital Erlangen, Clinic of Nuclear Medicine. 

Therefore, to limit patient dose, relatively small amounts of activity are 
usually injected, typically ranging from 100 to 1,000 MBq. The activity is then 
distributed throughout the body, leading to low numbers of emitted photons 
at any given area. The imaging task is thus similar to taking a photograph 
in a very dark room. A long exposure time can yield a better SNR, but 
comes with problems of motion blur and patient discomfort. A typical SPECT 
projection lasts 15 seconds, resulting in a total scan duration of 30 minutes 
for the usual 120 projections! Despite this effort, projections typically have 
only about 20 or fewer useful photons per pixel in diagnostically interesting 
areas. A representative projection from a skeletal SPECT scan is shown in 
Fig. 10.4. The mean pixel value is a measly 0.6, and the maximum is only 17, 
significantly less than even the noisiest simulation in Fig. 10.3(d). Due to higher 
scanner sensitivities, PET statistics are slightly better, with roughly a factor 
of 10 more counts per pixel at typical scan durations of 4-6 minutes for an 
equivalent field of view. 

The fundamental challenge in emission tomography is therefore to produce 
reconstructed images of the activity distribution v with acceptable image 
quality from noisy acquired data. The following sections describe other 
physical issues encountered as well. 


10.2.2 Photon Interactions 


Aside from the fundamental problem of noisy data, the second most 
important physical factor affecting emission tomography is photon attenuation. 
Photons traveling through a medium may interact with atoms and eventually 
be absorbed, resulting in a detected flux $I$ less than that originally emitted. 
In Chapter 8, we learned how to describe this principle using Beer's law and 
exploit it for imaging. For transmission tomography like X-ray CT, this 
phenomenon is imaged directly to yield reconstructed images of the medium's 
(i.e., the patient's) structure. This is possible because the location and current 
of the emitted flux $I_0$ are known. In emission tomography, however, $I_0$ is 
determined by the activity distribution in the body v, which is unknown. 
Attenuation is therefore a hindrance that leads to errors if it is not accounted 
for. 

Figure 10.5: A photon is deflected from its original path (vertical) by a 
scatter event and detected at an erroneous location to the left of its ideal 
position. 
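
To get a feeling for the size of this effect, Beer's law can be evaluated for photons emitted at different depths. The sketch below assumes a homogeneous medium and a linear attenuation coefficient of roughly 0.15 per centimeter, an approximate value for water at about 140 keV; both the constant name and the chosen depths are illustrative assumptions, not values from the text.

```python
import math

# ASSUMPTION: approximate linear attenuation coefficient of water at ~140 keV.
MU_WATER_PER_CM = 0.15


def transmitted_fraction(depth_cm, mu_per_cm=MU_WATER_PER_CM):
    """Fraction I / I_0 of photons that leave a homogeneous medium unattenuated."""
    return math.exp(-mu_per_cm * depth_cm)


for depth in (0.0, 5.0, 10.0, 20.0):
    print(f"depth {depth:4.1f} cm: I / I_0 = {transmitted_fraction(depth):.3f}")
```

Under these assumptions, fewer than a quarter of the photons emitted 10 cm inside the medium escape it, which illustrates why uncorrected emission images underestimate activity in deep structures.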

Among the photon-matter interactions, Compton scatter is very important 
for emission tomography (cf. Sec. 7.3). In this interaction, the photon 
is not absorbed as in attenuation, but merely deflected. The relationship 
between the deflection angle $\theta$ and the pre- and post-collision energies 
$E_0$ and $E_\mathrm{scat}$ is described by the Compton scattering formula: 

\[ E_\mathrm{scat} = \frac{E_0}{1 + \frac{E_0}{m_e c^2}\,(1 - \cos\theta)}, \tag{10.4} \]

where $m_e c^2 = 511$ keV is the rest energy of the electron. 
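
The energy loss predicted by the scattering formula can be tabulated for a few deflection angles. The sketch below uses a primary energy of 140.5 keV, the photopeak of ⁹⁹ᵐTc, as an example input; the function and constant names are chosen here for illustration.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0


def scattered_energy(e0_kev, theta_rad):
    """Photon energy after Compton scattering by the angle theta."""
    ratio = e0_kev / ELECTRON_REST_ENERGY_KEV
    return e0_kev / (1.0 + ratio * (1.0 - math.cos(theta_rad)))


for degrees in (0, 30, 90, 180):
    energy = scattered_energy(140.5, math.radians(degrees))
    print(f"theta = {degrees:3d} deg: E_scat = {energy:6.1f} keV")
```

Small deflections change the energy only slightly, which is why an energy window around the photopeak rejects large-angle scatter well but cannot remove photons scattered by small angles.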


Scatter is an important component of emission tomography due to its role 
in the degradation of image quality. Specifically, deflected photons may be 
erroneously assumed to come from locations in the image volume along their 
scattered trajectories, rather than their original paths. This process is 
illustrated in Fig. 10.5, where a photon originating in the image is scattered and 
counted at a detector pixel corresponding to a trajectory other than its 
original (vertical) path. This has the effect of reducing resolution, contributing 
to image noise, and reducing contrast. 

Figure 10.6: Simplified schematic representation of a gamma camera showing 
three primary components. 


10.3 Acquisition Systems 


10.3.1 SPECT 


Early methods for detecting photons emitted from radiotracers focused on 
scanning probes (e.g. Geiger counters) over the patient. Scanning a field of 
view of any reasonable size was therefore a painstaking process, and 3-D re- 
construction was out of the question. In 1957, Hal Anger solved this problem 
with the invention of the gamma camera, shown schematically in Fig. 10.6. 
A classical gamma camera consists of three components: a collimator, a scin- 
tillator, and an array of photomultiplier tubes (PMTs). 

The collimator is composed of a lattice of holes separated by septa made 
of some highly attenuating material (usually lead). Its role is to restrict the 
angle of acceptance at each point on the detector surface and provide an 
(ideally) parallel projection of the object being imaged onto the scintillator. 
In optical imaging equipment, for example, this is usually accomplished by 
means of a small aperture known as a pinhole. For this reason, collimators with a 
parallel-hole geometry consisting of a large array of narrow, parallel bores 
are the most commonly used type for SPECT imaging.² 

Ideally, a point source placed in front of the detector would yield a perfect 
point in the image. However, the bores of a collimator are neither infinitely 
long nor infinitely narrow, leading to a finite acceptance angle that allows 
photons traveling from the point to reach the detector via a range of rays about 
the ideal one (i.e., the shortest path from point to detector). The structure 
of these alternate paths is described by the collimator's PSF and effectively 
blurs more complicated objects being imaged, which can be thought of as 
collections of many points. 

² Advanced reconstruction algorithms can take advantage of the benefits of non-parallel 
projection methods, provided they are accomplished by means of multi-hole collimators 
(fan-beam, convergent, divergent). 

Figure 10.7: Schematic representation of collimator and PSF (yellow). The 
acceptance angle of a bore is dependent on bore length and width, leading 
to a widening of the PSF with depth. 

This effect can be seen in Fig. 10.7, which shows a schematic representation 
of a trio of 1-D parallel collimator bores in front of a detector. A virtual point 
source placed at the intersection of the red arrows would be able to reach the 
detector along a number of rays. Photons reaching the detector on direct 
paths through air are termed geometric, because their PSF is only a function 
of the detector and collimator geometry. On an infinitely precise detector, 
the resulting response would be an array of indicator functions, but due to 
pixelation in the acquired image and other factors, the PSF has a roughly 
conical shape. 

In many applications it is modeled as a Gaussian, and the resolution is 
characterized by the full width at half maximum (FWHM) $r_\mathrm{PSF}$, which may 
be approximated by the following equation: 

\[ r_\mathrm{PSF} \approx \frac{D_b\,(L_b + z)}{L_b}, \tag{10.5} \]

where $D_b$ is the bore diameter, $L_b$ its length, and z the distance between the 
source plane and the face of the collimator. From Eq. (10.5), it can be seen 
that the resolution is depth-dependent and that the PSF becomes wider with 
increasing z for given collimator dimensions. 

An image of a point source showing a true PSF is shown in Fig. 10.8. 
The image is saturated to highlight the complex structure. In the bright 
central area outlined in red, primarily geometric photons are present. In the 
region immediately adjacent to it outlined in blue, photons passing through 
a portion of a single septum are detected. The long “spider”-like legs are 
due to septal penetration across multiple walls, which is most probable in 
a direction perpendicular to the edges of the hexagonal collimator bores. A 
faint background between these streaks is caused by Compton scattering in 
the collimator and contaminates the entire function. The magnitude of the 
spider legs is up to 1.5% of the maximum PSF value, and for ⁹⁹ᵐTc up to 
10% of photons may be extra-geometric and thus not accounted for by ideal 
models. Therefore, some in the field have begun to use PSF models based on 
measured true data rather than ideal calculations. 

Figure 10.8: Measured PSF from a ⁹⁹ᵐTc point source imaged at a distance 
of 10 cm, shown saturated to emphasize low-intensity regions. The bright 
geometric region is outlined in red, and most extra-geometric counts lie 
between the red and blue hexagons, where a single partial septal wall is 
penetrated. 

Issues of resolution and septal penetration are important when designing 
a collimator. The collimator efficiency p is also significant, as it describes the 
ratio of geometric photons passed through the collimator to the total number 
emitted towards it. It is ideally constant over z for the parallel hole case and 
can be approximated as 

\[ p \approx K^2 \left(\frac{D_b}{L_b}\right)^2 \frac{D_b^2}{(D_b + T)^2}, \tag{10.6} \]

where T is the width of the septal wall and K is a constant based on hole 
geometry. A typical value of p is on the order of $10^{-4}$, making it a key, 
albeit unavoidable, limiting factor in the sensitivity of SPECT systems. 

In Eq. (10.6), it can be seen that p increases as bores are either shortened 
or widened. However, from Eq. (10.5), we see that these changes degrade 
resolution. Taking Eq. (10.5) and Eq. (10.6) together, it becomes apparent that 
the task of collimator design is a compromise between collimator sensitivity 
and resolution. The former directly impacts the quality of counting statistics, 
and therefore noise, in an acquired image. The latter is related to the 
accuracy with which the detector can localize photons and properly reproduce 
small features such as edges. A third consideration appears via the septal 
thickness which, when increased, limits the star artifacts shown in Fig. 10.8 
at the expense of smaller p. 
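
The trade-off between Eq. (10.5) and Eq. (10.6) can be explored numerically. The sketch below compares two hypothetical parallel-hole designs; all dimensions are made up for illustration, and the value K = 0.26 is an assumption for hexagonal holes that does not come from the text.

```python
def psf_fwhm(bore_diameter, bore_length, distance):
    """Geometric resolution (FWHM) following Eq. (10.5), same unit as inputs."""
    return bore_diameter * (bore_length + distance) / bore_length


def efficiency(bore_diameter, bore_length, septal_thickness, k=0.26):
    """Geometric efficiency following Eq. (10.6); k is an assumed shape constant."""
    return (k ** 2
            * (bore_diameter / bore_length) ** 2
            * bore_diameter ** 2 / (bore_diameter + septal_thickness) ** 2)


# Hypothetical designs (all lengths in mm), evaluated at a source distance of 100 mm.
designs = {
    "long bores": dict(bore_diameter=1.5, bore_length=35.0, septal_thickness=0.2),
    "short bores": dict(bore_diameter=1.5, bore_length=25.0, septal_thickness=0.2),
}
for name, dims in designs.items():
    fwhm = psf_fwhm(dims["bore_diameter"], dims["bore_length"], 100.0)
    eff = efficiency(**dims)
    print(f"{name:11s}: FWHM = {fwhm:4.1f} mm, efficiency = {eff:.2e}")
```

Shortening the bores raises the efficiency but widens the PSF at the same distance, which is the compromise described above.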

Once a photon has passed through the collimator, it impacts the system's 
scintillator (typically composed of NaI), releasing several lower energy 
photons in the visible range. These photons then travel to the PMTs, where 
they initiate an electron avalanche that is detected as a current signal at the 
PMT output. To determine the 2-D location of a photon, a type of centroid 
is computed by the output electronics of the PMT array in a process known 
as Anger logic, after its inventor. In the 1-D case, the estimated location of 
the photon detection $\hat{x}$ can be calculated as 

\[ \hat{x} = \frac{\sum_q x_q G_q}{\sum_q G_q}, \tag{10.7} \]

where $G_q$ and $x_q$ are the signal strength at and location of the q-th PMT. 
Applied in this fashion directly, images will suffer from nonuniformities and 
pincushion distortions. These are removed by replacing $G_q$ with some 
nonlinear function thereof. Even after this correction, the method is not exact, 
and the resulting finite resolution $r_\mathrm{PMT}$ adds in quadrature with that of the 
PSF to yield a total system resolution $r_\mathrm{sys} = \sqrt{r_\mathrm{PSF}^2 + r_\mathrm{PMT}^2}$. Another 
important property of the PMT output is that the value of $\sum_q G_q$ is proportional 
to the energy of the initial photon. This allows SPECT cameras to be energy 
resolving as well, allowing the effects of Compton scatter to be mitigated. 
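
The 1-D centroid of Eq. (10.7) takes only a few lines to compute. The following sketch uses hypothetical PMT positions and signal strengths and omits the nonlinear correction mentioned above; the function name is chosen for illustration.

```python
import numpy as np


def anger_position(pmt_positions, pmt_signals):
    """Return the signal-weighted centroid and the summed signal."""
    x = np.asarray(pmt_positions, dtype=float)
    g = np.asarray(pmt_signals, dtype=float)
    total = g.sum()
    return float((x * g).sum() / total), float(total)


# Hypothetical 1-D array of five PMTs (positions in cm, signals in arbitrary units).
positions = [-10.0, -5.0, 0.0, 5.0, 10.0]
signals = [0.0, 1.0, 6.0, 3.0, 0.0]
x_hat, energy_signal = anger_position(positions, signals)
print(x_hat, energy_signal)
```

The second return value is the summed signal, which is proportional to the photon energy and can therefore be compared against an energy window to reject scattered photons.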


10.3.2 PET 


As shown in Fig. 10.1, the β⁺ decay that forms the basis of PET produces 
two photons that travel in opposing directions away from each other. This 
is exploited for imaging purposes by using a ring detector and looking for 
coincidences in the observed data. This coincidence detection principle is 
illustrated in Fig. 10.9, where a PET ring composed of many small detector 
blocks is shown. Extremely high speed electronics monitor each detector's 
output signal and record a detection event when two impulses are detected 
simultaneously. The detector blocks themselves are traditionally composed of 
a scintillator crystal mated to a small PMT array as with the Anger camera. 
However, no collimator is needed to restrict the scintillator's acceptance angle 
in this case because the photon's incidence angle is implicitly provided by the 
detector block at the opposite side of the ring. Nevertheless, inaccuracies in 
the scintillator blocks and PMTs still induce a finite PSF in PET whose 
geometrical properties vary widely depending on the source's location in the 
field of view. 

Figure 10.9: PET ring detector and coincidence detection principle. The 
detector electronics simultaneously monitor signals from each detector block 
and record counts when an impulse is detected from two blocks at the same 
time. 

The ray connecting the two detection points (red line in Fig. 10.9) is known 
as the line of response. Integrating along all parallel lines of response at a 
particular rotation angle will produce a row of a sinogram at that angle that 
can be used for reconstruction. Early PET systems treated each axial ring 
of detector blocks as independent slices and thus ignored lines of response 
with oblique axial angles. This strategy, shown in Fig. 10.10(a), reduces the 
computational burden on detector electronics (coincidences from fewer blocks 
must be monitored simultaneously), but sacrifices many counts. 

Newer systems utilize a 3-D detection configuration (cf. Fig. 10.10(b)), 
where lines of response across a finite axial range are observed. This provides 
an increase in sensitivity due to the fact that, for a given source location, 
counts can be registered at a greater number of detectors. However, by the 
same token, it is more probable that false (random) coincidences or pairs of 
scattered photons will be detected. Also, an extra step of axial rebinning is 
needed to produce a sinogram. Spatial and Fourier domain strategies exist, 
but the common goal is to transform the acquired oblique lines of response 
into approximate virtual lines of response perpendicular to the axial direction. 
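The text does not name a specific rebinning method. As one example of a spatial-domain strategy, the sketch below uses single-slice rebinning, which assigns each oblique line of response to the transaxial slice midway between its two detector rings. The event format `(ring_a, ring_b, sino_bin)` and the `max_ring_diff` parameter are hypothetical names introduced for this illustration.

```python
from collections import Counter


def rebin_ssrb(coincidences, max_ring_diff):
    """Single-slice rebinning of oblique lines of response.

    coincidences: iterable of (ring_a, ring_b, sino_bin) tuples, where
    sino_bin identifies the transaxial sinogram bin of the event.
    Slices are indexed in half-ring steps, so direct planes receive even
    indices and cross planes receive odd indices.
    """
    rebinned = Counter()
    for ring_a, ring_b, sino_bin in coincidences:
        if abs(ring_a - ring_b) > max_ring_diff:
            continue  # too oblique for the mid-plane approximation
        slice_index = ring_a + ring_b  # twice the mean axial ring position
        rebinned[(slice_index, sino_bin)] += 1
    return rebinned


events = [(0, 0, 5), (0, 2, 5), (1, 1, 5), (0, 7, 5)]
counts = rebin_ssrb(events, max_ring_diff=3)
```

The approximation degrades for sources far from the scanner axis, which is one reason to limit the accepted ring difference.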

PET has a number of advantages over SPECT due to more favorable 
physics. Sensitivity is roughly an order of magnitude higher due to the absence 
of a collimator, and the ring detector offers better tomographic consistency 
(i.e. all angles are acquired simultaneously). Furthermore, the reconstruc- 
tion problem is better defined than with SPECT due to the (ideally) 1-D 
search space along each line of response. Mathematically, this translates into 
a system matrix that is better conditioned. By using TOF information de- 
rived from slight delays between coincidence detections, the range of possible 
emission locations can be even further reduced. 
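The benefit of TOF information can be estimated from the speed of light alone: a timing difference Δt shifts the emission point by c·Δt/2 away from the midpoint of the line of response. The sketch below applies this relation; the 500 ps timing resolution is an illustrative assumption, not a value from the text.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond


def tof_offset_mm(delta_t_ps):
    """Distance of the emission point from the midpoint of the line of response."""
    return 0.5 * C_MM_PER_PS * delta_t_ps


# A timing uncertainty of 500 ps (assumed) corresponds to roughly 75 mm along the line.
localization_mm = tof_offset_mm(500.0)
```

Because this uncertainty is usually much larger than a voxel, TOF does not replace reconstruction, but it restricts each event to a short segment of its line of response.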

Figure 10.10: 2-D (left) and 3-D (right) detection configurations for PET. 
The latter offers better sensitivity at the expense of more scatter events. 


TOF PET systems with 3-D detection thus typically offer superior res- 
olution and noise characteristics compared to SPECT, but this comes at a 
price. ¹⁸F, the most common isotope used in PET, has a half-life of only 110 
minutes and is more difficult to produce than ⁹⁹ᵐTc, requiring a complex 
logistical network to minimize the time between production and injection. 
Furthermore, the higher energy photons imaged in PET require costly ex- 
otic scintillator materials. This, combined with highly specialized detector 
electronics, makes PET systems more expensive to procure and operate than 
their SPECT counterparts. 
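The logistical pressure caused by the short half-life follows directly from the exponential decay law. The sketch below computes the fraction of activity that remains after a given delay; the function name and the example delays are chosen for illustration only.

```python
def remaining_fraction(elapsed_min, half_life_min=110.0):
    """Fraction of the initial activity left after elapsed_min minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


# After one half-life, half of the activity is left; after two, a quarter.
after_one = remaining_fraction(110.0)
after_two = remaining_fraction(220.0)
```

A delivery delay of under four hours therefore already costs three quarters of the produced activity.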


10.4 Reconstruction 


10.4.1 Filtered Back-Projection 


In Chapter 8, we presented the filtered back-projection method of reconstruc- 
tion in the context of X-ray CT. The advantages of this reconstruction are 
its speed and simplicity, as well as reconstructed image properties, such as 
resolution, that are relatively easy to determine. However, while filtered back- 
projection works quite well for high-count data, it fails to take into account 
the Poisson statistics outlined in Sec. 10.2.1. This leads to very noisy images 
in SPECT and PET, where detected counts are several orders of magnitude 
lower than those seen in CT. 
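The effect of low counts can be quantified with the Poisson property that the variance equals the mean. The count levels below are illustrative assumptions and are not taken from the text.

```latex
% Relative noise of a Poisson-distributed count with mean \bar{N}
\frac{\sigma_N}{\bar{N}} = \frac{\sqrt{\bar{N}}}{\bar{N}} = \frac{1}{\sqrt{\bar{N}}},
\qquad
\bar{N} = 10^{2} \;\Rightarrow\; 10\,\%,
\qquad
\bar{N} = 10^{6} \;\Rightarrow\; 0.1\,\%.
```

A detector pixel with four orders of magnitude fewer counts thus carries one hundred times more relative noise, which filtered back-projection passes on to the image.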

Furthermore, filtered back-projection operates by inverting the Radon 
transform, the purely geometrical relationship between voxels in the 
image and their projected pixels at the detector. This ignores all of the other 
physical factors, such as attenuation, scatter, and the PSF, that play a vital 
role in the emission tomography image formation chain. This oversight leads 
to artifacts in reconstructed images that greatly degrade image quality. For 
these reasons, filtered back-projection is generally no longer used in clinical 
situations. 


10.4.2 Iterative Reconstruction 


In order to improve the noise performance of filtered back-projection, we 
must use the statistical relationship in Eq. (10.2) between the mean activity 
distribution in each voxel and the observed counts at the detector. Filtered 
back-projection implicitly assumes a deterministic relationship, but we can 
take stochastic effects into account by using the definition of the Poisson 
probability mass function. 

Geek Box 10.1 describes how probable observed detector data are given a 
set of model parameters, which take the form of a vector of Poisson means $\nu$ 
for each voxel in our case. Obviously, in emission tomography, these param- 
eters are unknown. However, the likelihood function provides us with a tool 
to estimate them by searching for the set $\hat{\nu}^{*}$ that maximizes $P(D = d)$ 
and thus yields the most likely estimate given our data: 


$$\hat{\nu}^{*} = \operatorname*{argmax}_{\nu} \; P(D = d). \qquad (10.8)$$
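For the simplest case of one voxel emitting into one detector pixel, the maximization can be carried out by hand. The short derivation below is a sketch for this special case only and uses the Poisson likelihood $P(D = d) = e^{-\nu}\nu^{d}/d!$.

```latex
\ln P(D = d) = d \ln \nu - \nu - \ln d!
\quad\Rightarrow\quad
\frac{\partial}{\partial \nu} \ln P(D = d) = \frac{d}{\nu} - 1 = 0
\quad\Rightarrow\quad
\hat{\nu}^{*} = d .
```

The most likely mean is simply the observed count. The difficulty in tomography arises because each detector pixel sums contributions from many voxels.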


The relationship described in Geek Box 10.1 is quite complex, and it is not 
immediately clear how to maximize the likelihood. However, a seminal paper 
by Shepp and Vardi in 1982 showed that this can be accomplished via the Ex- 
pectation Maximization (EM) algorithm, whose general framework involves 
the estimation of the “complete” information, given a set of observations and 
hidden, “latent”, information. Although a detailed description is outside the 
scope of this text, it is worth outlining that for emission tomography, the 
complete information is the actual emission distribution $\nu$, and the observa- 
tions are the counts in the projections $d$. The latent information consists 
of all of the photons originating in the image that escape detection. 

Geek Box 10.1: Total Likelihood Function 


a) For the simple case of counts from one voxel being emitted directly 
into a single pixel detector, the probability of a particular observation 
$d$ given the true mean $\nu$ is 

$$P(D = d) = \frac{e^{-\nu}\,\nu^{d}}{d!},$$

which is known as the likelihood of the observation. 

b) Moving one step further, an array of observations $d$ is formed 
by photons emitted from a vector of voxels with means $\nu$. This is the 
same scenario we examined in the example in Fig. 10.3. Here, the 
system matrix is equivalent to the identity matrix, $A = I$, and each 
voxel contributes to a single detector element. As each observation is 
independent of the others, we can multiply them together to obtain 
our likelihood: 

$$P(D = d) = \prod_{i} \frac{e^{-\nu_i}\,\nu_i^{d_i}}{d_i!},$$

where $i$ represents the index of the detector and image elements, which 
are equivalent in this case. 

c) In a true imaging scenario, $A \neq I$, and multiple image voxels 
contribute to a single detector element. To account for this, we must 
subdivide the total detected counts in each pixel $d_m$ into contributions 
from each image voxel: $d_m = \sum_n d_m(n)$. The probabilities contained 
in the system matrix must also be included. The total likelihood is 
therefore the sum over each of these possible scenarios: 

$$P(D = d) = \sum_{\{d_m(n)\,:\,\sum_n d_m(n) = d_m\}} \; \prod_{m,n} \exp\left(-\nu_n a_{m,n}\right) \frac{\left(\nu_n a_{m,n}\right)^{d_m(n)}}{d_m(n)!}.$$

The dual product has the effect of incorporating the contribution from 
each voxel to each pixel. 


As shown by Shepp and Vardi, EM’s methodology of alternatingly forming 
a conditional expectation via marginalization over a particular variable and 
then maximizing the resulting likelihood provides a convenient framework for 
attacking Eq. (10.8) as encountered in emission tomography. This expecta- 
tion/maximization cycle is repeated until a suitable image is obtained, and 
each one of these repetitions is referred to as one iteration $k$. The algorithm 
begins by initializing some estimate of the activity distribution $\hat{\nu}^{0}$. The pro- 
cess proceeds at each iteration by forward projecting the current estimate $\hat{\nu}^{k}$, 
comparing it to the measured data, backprojecting the result, and applying 
a weight to the estimate to create a new $\hat{\nu}^{k+1}$. The update mechanism for the 
algorithm can be expressed using the following equation: 


$$\hat{\nu}^{k+1} = \frac{\hat{\nu}^{k}}{A^{\top}\mathbf{1}} \times A^{\top}\frac{d}{A\hat{\nu}^{k}}. \qquad (10.9)$$

Here, $\mathbf{1}$ is a vector of ones, and the multiplication and the divisions 
between vectors are performed element-wise. 


Collectively, this method is known as the maximum likelihood expectation 
maximization (MLEM) algorithm and is widely used in many commercial 
and research applications. 
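Written out per voxel, the update of Eq. (10.9) shows the forward projection, comparison, back-projection, and weighting steps explicitly. The form below is a sketch that assumes the indexing of Geek Box 10.1, with $a_{m,n}$ linking voxel $n$ to detector pixel $m$.

```latex
\hat{\nu}_n^{k+1}
  = \underbrace{\frac{\hat{\nu}_n^{k}}{\sum_{m} a_{m,n}}}_{\text{weighting}}
    \;\underbrace{\sum_{m} a_{m,n}}_{\text{back-projection}}\;
    \underbrace{\frac{d_m}{\sum_{n'} a_{m,n'}\,\hat{\nu}_{n'}^{k}}}_{\text{comparison with forward projection}}
```

Because the update is multiplicative, an estimate that starts out positive remains non-negative in every iteration, and the ratio in the last factor equals one wherever the forward projection already matches the data.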

The iterative reconstruction process for MLEM thus consists of an objec- 
tive function that describes the quality of the current estimate (the likelihood 
function) and a way of optimizing it (EM). Within the field of image recon- 
struction, a wide range of objective function/optimizer pairs are available. 
Another objective function that has found wide use is weighted least squares: 


$$\hat{\nu}^{*} = \operatorname*{argmin}_{\nu} \lVert d - A\nu \rVert_{w}^{2} = \operatorname*{argmin}_{\nu} \sum_{m} w_m \left(d_m - [A\nu]_m\right)^{2}, \qquad (10.10)$$


where $[A\nu]_m$ is the m-th pixel of the forward projected estimate and $w$ is 
a vector of weights. The weights are often used to take noise into account 
by, for example, setting each element of $w$ equal to the reciprocal of an 
estimate of the variance at the corresponding detector pixel. This has the 
effect of adjusting each pixel's contribution to the objective function 
depending on its noise properties, so that noisier pixels count less. The 
weighted least squares objective function is convex and can be minimized 
with gradient-based optimization techniques such as the conjugate 
gradient method. 
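The gradient needed by such optimizers follows directly from Eq. (10.10). The sketch below introduces the diagonal matrix $W = \operatorname{diag}(w)$ as a notational assumption that does not appear in the text.

```latex
\Phi(\nu) = (d - A\nu)^{\top} W \,(d - A\nu),
\qquad
\nabla_{\nu}\Phi = -2\,A^{\top} W \,(d - A\nu),
\qquad
\nabla_{\nu}^{2}\Phi = 2\,A^{\top} W A \succeq 0 \quad \text{for } w_m \ge 0 .
```

The positive semi-definite Hessian confirms the convexity stated above. Each gradient evaluation costs one forward projection and one back-projection, the same operations used by MLEM.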


10.4.3 Quantitative Reconstructions 


Although iterative reconstruction is motivated by the underlying statistics of 
photon emission, another major advantage is its ability to model the physics 
of the imaging system. This is accomplished via the system matrix. In ad- 
dition to geometric information, the system matrix can include the effects 
of attenuation and scatter to allow the reconstruction to correct for them. 
Furthermore, resolution lost due to PSF blurring may be regained to some 
extent if this is modeled as well. 

Aside from image quality improvements such as contrast enhancement and 
noise reduction, proper system modeling enables PET and SPECT systems to 
become quantitative as well. In other words, instead of reconstructing images 
in arbitrary or relative units, absolute units such as activity concentration 
in kBq/ml are produced. This important distinction allows scans across different 
patients, scanners, and time points to be meaningfully compared. This is 
not only useful for individual patient management, but enables larger, multi- 
center clinical studies as well. 

Assuming an accurate system model is available, the cornerstone of a quan- 
titative imaging system is the calibration. This anchors the counts observed 
during an acquisition to a physical amount of radioactivity in the detector’s 
field of view. A common way of doing this is to perform an acquisition on 
a homogeneous phantom with a known activity concentration expressed in 
kBq/ml. A volume of interest may then be defined in the reconstructed image, 
and a count density in units of counts per ml may be determined. Time must 
then be taken into account by correcting for decay and normalizing by the 
acquisition duration. After these steps, a volumetric sensitivity factor $\alpha_{\mathrm{VOL}}$ 
may be defined relative to its units as follows: 


$$\alpha_{\mathrm{VOL}} = \frac{\;\dfrac{\text{counts}}{\text{minute} \cdot \text{ml}}\;}{\dfrac{\text{kBq}}{\text{ml}}}. \qquad (10.11)$$

With this factor in hand, subsequent acquisitions may be quantified, provided 
they are acquired with the same isotope and reconstructed in the same way. 
The procedure for this is straightforward and consists of obtaining the count 
rate density from a volume of interest in units of counts/(minute·ml) and 
dividing by $\alpha_{\mathrm{VOL}}$, thus producing the desired absolute units of kBq/ml. 
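The calibration and quantification steps can be sketched as two small functions. The function names, argument names, and phantom numbers are hypothetical, and the counts are assumed to be decay-corrected already.

```python
def volumetric_sensitivity(voi_counts, voi_volume_ml, duration_min,
                           activity_conc_kbq_per_ml):
    """Calibration factor of Eq. (10.11) in counts / (minute * kBq)."""
    count_rate_density = voi_counts / (voi_volume_ml * duration_min)
    return count_rate_density / activity_conc_kbq_per_ml


def quantify(voi_counts, voi_volume_ml, duration_min, alpha_vol):
    """Convert decay-corrected counts in a volume of interest to kBq/ml."""
    count_rate_density = voi_counts / (voi_volume_ml * duration_min)
    return count_rate_density / alpha_vol


# Phantom: 120000 counts in a 100 ml volume over 10 minutes at 20 kBq/ml.
alpha_vol = volumetric_sensitivity(120000, 100.0, 10.0, 20.0)
# Later scan: 60000 counts in a 100 ml volume over 10 minutes.
concentration = quantify(60000, 100.0, 10.0, alpha_vol)
```

With these example numbers the calibration factor is 6 counts/(minute·kBq), and the later scan corresponds to 10 kBq/ml.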

This solution is not without drawbacks. It requires the filling of a phan- 
tom for calibration and is vulnerable to errors and inconsistencies that come 
from user-defined volumes of interest. A more elegant method is to utilize a 
planar sensitivity $\alpha_{\mathrm{planar}} = \frac{\text{counts}}{\text{minute} \cdot \text{kBq}}$. This value is then incorporated into the 
system matrix to relate the forward projected counts to absolute activity in 
the reconstructed volume. The result is a reconstruction that is inherently 
quantitative and dependent on a calibration factor that can be obtained from 
less tedious planar acquisitions of a point source. 

In the medical community, it is also of interest to normalize for factors 
such as patient weight and injected dose. The commonly used Standardized 
Uptake Value (SUV) is an example of this. It is based on the assumptions 
that a) a tracer in healthy tissue will distribute uniformly throughout the 
body and b) that the body has a uniform density equal to that of water (i.e. 
1 kg/l). Combining these assumptions yields the following relation: 


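$$\mathrm{SUV} = \frac{\mathrm{kBq}_{\mathrm{VOI}} \,/\, \mathrm{ml}_{\mathrm{VOI}}}{\mathrm{kBq}_{\mathrm{INJ}} \,/\, \mathrm{g}}, \qquad (10.12)$$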
where the subscripts VOI and INJ refer to quantities drawn from a recon- 
structed volume of interest (e.g. a region surrounding a suspect tumor) and 
total injected dose, respectively. 
Despite the somewhat unintuitive units of g/ml, the logic behind the SUV 
is sound: a value significantly greater than unity indicates a disproportionate 
amount of uptake and a potential abnormality. This is particularly the case 
for tracers where assumption (a) from above holds. Furthermore, by normal- 
izing for two factors that vary across acquisitions (injected dose and patient 
weight), the SUV allows for easier comparison between different patients and 
time points. 

Figure 10.11: 1-D signal (green) convolved with a Gaussian to yield “observed 
data” that is reconstructed (i.e. deconvolved) using the MLEM algorithm 
(blue curve). In this case, the blurring function is not modeled, and the 
reconstruction cannot improve upon the observed data (the two curves are 
equal). Figure courtesy of Siemens Molecular Imaging Inc., USA. 

Numerous variations on the SUV exist. One of the most popular is the 
$\mathrm{SUV}_{\max}$, which simply places the maximum activity concentration found in 
a volume of interest in the numerator of Eq. (10.12) to guard against partial 
volume effects. Other extensions normalize by lean body mass or body surface 
area to better account for anatomical variations. 
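The SUV and its maximum variant reduce to a few arithmetic operations. The sketch below assumes decay-corrected inputs and body mass given in grams; the function names and patient numbers are hypothetical.

```python
def suv(conc_kbq_per_ml, injected_kbq, body_mass_g):
    """Standardized uptake value in g/ml, following Eq. (10.12)."""
    return conc_kbq_per_ml / (injected_kbq / body_mass_g)


def suv_max(voi_conc_kbq_per_ml, injected_kbq, body_mass_g):
    """SUV computed from the hottest voxel in a volume of interest."""
    return suv(max(voi_conc_kbq_per_ml), injected_kbq, body_mass_g)


# 350 MBq injected into a 70 kg patient gives 5 kBq per gram of body mass.
uniform = suv(5.0, injected_kbq=350000.0, body_mass_g=70000.0)
lesion = suv_max([4.0, 25.0, 10.0], injected_kbq=350000.0, body_mass_g=70000.0)
```

Tissue at the body-average concentration yields an SUV of 1, while the hottest voxel of the example lesion yields an SUV of 5.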


10.4.4 Practical Considerations 


Although superior to analytical methods, iterative reconstructions are not 
without their own complications. Namely, the inclusion of a system model 
and optimization scheme adds a plethora of parameters that must be tailored 
to the imaging task at hand. Poor judgment in selecting these values may 
degrade image quality. 

To illustrate this concept, we use a simple 1-D signal with two step func- 
tions blurred by a Gaussian. The original signal represents the truth, and its 
blurred version our observed data. If we initialize a constant function and ap- 
ply the MLEM algorithm from Eq. (10.9), we can attempt to “reconstruct” 
the truth from the data. In Fig. 10.11, the results are shown for the case 
where the blurring function is not modeled (similar to an emission tomography 
reconstruction without PSF compensation). As expected, the best our 
method can do is to adjust the constant initialization until it matches the 
blurred observations: the two curves are identical. 

Figure 10.12: Reconstruction after six MLEM iterations where the blur- 
ring function is modeled in the system matrix. Edge resolution is improved 
over Fig. 10.11, but ringing artifacts also become visible. Figure courtesy of 
Siemens Molecular Imaging Inc., USA. 

Figure 10.13: MLEM reconstruction after 300 iterations. The edges have 
become sharper, and artifacts amplified relative to Fig. 10.12. Figure courtesy 
of Siemens Molecular Imaging Inc., USA. 

Fig. 10.12 shows six MLEM iterations with the blur incorporated into 
the system matrix. This is equivalent to adding a deconvolution problem 
to our reconstruction, and we see that the edges have become sharpened 
as frequencies suppressed by the blur are recovered by the reconstruction. 
However, ringing artifacts have also become visible due to the fact that the 
original spectrum is only partially recovered. The results after 300 iterations 
are shown in Fig. 10.13, where even better edge resolution is achieved, albeit 
with more severe ringing as well. 

If we incorporate Poisson noise into the observed data, we can make our 
experiment more realistic. How does this change the results? The reconstruc- 
tion after six iterations shown in Fig. 10.14 indicates that they are broadly 
comparable to the noiseless case in Fig. 10.12, although slight irregularities 
inside the wide object can be seen. However, the case after 300 iterations 
shown in Fig. 10.15 is starkly different from its noiseless counterpart, with 
the interior of the wide object becoming very inhomogeneous. 

Figure 10.14: Six MLEM iterations on data perturbed with Poisson 
noise. The result is slightly irregular but comparable to Fig. 10.12. Figure 
courtesy of Siemens Molecular Imaging Inc., USA. 

Figure 10.15: 300 MLEM iterations on noisy data. The interior of the large 
object is highly irregular, and quality is noticeably worse than in Fig. 10.13. 
Figure courtesy of Siemens Molecular Imaging Inc., USA. 

The noise in the reconstructed signal in Fig. 10.15 is a result of the ill- 
conditioned nature of the reconstruction problem and can be generalized to 
the case of emission tomography. During early iterations, low frequencies are 
recovered that correspond mostly to signal information, such as high-contrast, 
large objects. However, at higher iterations, the algorithm turns its attention 
to higher frequencies where the signal and noise energies are comparable. The 
result is an overfitting of the noise and degradation of image quality. 
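The 1-D experiment can be approximated with the standard library alone. The sketch below is not the code behind the figures: the signal length, object amplitudes, blur width, random seed, and the roughness measure are all assumptions chosen for illustration.

```python
import math
import random


def blur_matrix(n, sigma):
    """Dense n x n system matrix for a row-normalized Gaussian blur."""
    rows = []
    for m in range(n):
        row = [math.exp(-0.5 * ((m - k) / sigma) ** 2) for k in range(n)]
        total = sum(row)
        rows.append([value / total for value in row])
    return rows


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def mlem(A, d, n_iter):
    """MLEM update of Eq. (10.9), started from a constant estimate."""
    At = [list(col) for col in zip(*A)]   # transpose, used for back-projection
    sens = [sum(col) for col in At]       # sensitivity term A^T 1
    nu = [1.0] * len(At)
    for _ in range(n_iter):
        proj = matvec(A, nu)
        ratio = [dm / pm if pm > 0.0 else 0.0 for dm, pm in zip(d, proj)]
        back = matvec(At, ratio)
        nu = [v * b / s for v, b, s in zip(nu, back, sens)]
    return nu


def neg_log_likelihood(A, nu, d):
    """Poisson negative log-likelihood of d given nu, up to a constant."""
    return sum(pm - dm * math.log(pm) for pm, dm in zip(matvec(A, nu), d))


def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def roughness(values):
    """Standard deviation, used as a crude measure of noise inside an object."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


n = 48
truth = [5.0] * n                      # low, non-zero background
truth[8:28] = [100.0] * 20             # wide object
truth[36:40] = [100.0] * 4             # narrow object

A = blur_matrix(n, sigma=2.0)
identity = [[1.0 if m == k else 0.0 for k in range(n)] for m in range(n)]
data = matvec(A, truth)                # noiseless "observed data"

unmodeled = mlem(identity, data, 6)    # blur ignored: returns the data itself
recon_6 = mlem(A, data, 6)             # blur modeled, few iterations
recon_300 = mlem(A, data, 300)         # blur modeled, many iterations

rng = random.Random(0)
noisy = [float(poisson_sample(mean, rng)) for mean in data]
noisy_6 = mlem(A, noisy, 6)
noisy_300 = mlem(A, noisy, 300)

for label, recon in (("6 iterations", noisy_6), ("300 iterations", noisy_300)):
    print(label, "roughness inside wide object:", round(roughness(recon[12:24]), 2))
```

Without the blur in the system matrix, the first update already reproduces the data, and later updates change nothing. With the blur modeled, more iterations fit the data ever more closely, which is exactly what allows noise to be fitted as well.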

By iterating further, we can increase resolution and thus reduce quanti- 
tative bias due to edge roll-off. However, as seen in Fig. 10.15, this runs the 
risk of introducing too much noise into the image. The use of image post- 
smoothing or smoothness regularization can reduce noise while sacrificing 
some resolution and thus entails making the same type of compromise. In 
practice, the choice of many reconstruction parameters is therefore another 
example of the bias/variance trade-off already discussed above with respect 
to SPECT collimator design. 

Another complication for iterative reconstruction in emission tomography 
is that there are source-dependent factors such as attenuation and scatter in 
the system matrix. This implies that the properties of the reconstruction will 
vary from patient to patient, even if the same acquisition and reconstruction 
parameters are used. Furthermore, depth- and position-dependent PSFs in 
SPECT and PET lead to shift-variant properties within a given image as 
well. These factors should be taken into account when reconstructing and 
interpreting images. 


10.5 Clinical Applications 


Molecular imaging is used in various fields of medicine such as neurology, 
oncology, cardiology, and orthopedics. Its application areas can be broadly 
subdivided into two fields: diagnostics and therapy. 


10.5.1 Diagnostics 


As the most common use for emission tomography, diagnostics is also the 
most diverse. In the field of neurology, both PET and SPECT offer perfusion 
tracers that give physicians insight into the amount of blood flow in the 
brain during the scan, which is proportional to brain activity. An example 
of a SPECT brain perfusion procedure using ⁹⁹ᵐTc-ethyl cysteinate dimer 
(ECD) is shown in Fig. 10.16(a). An epileptic patient is scanned immediately 
following a seizure and during a neutral state. The reconstructed images are 
subtracted and fused with an MR of the patient's brain to localize the focus 
of the seizure. More specialized applications including imaging of amyloid 
plaques linked to Alzheimer's disease (PET) and dopamine receptor imaging 
(SPECT) are also available. 

For oncology, ¹⁸F, the most commonly used PET isotope, may be bonded 
to a molecule in the glucose family resulting in so-called fludeoxyglucose 
(FDG). Using these FDG-PET scans, doctors can search for areas with 
high glucose metabolism, a sign of rapidly growing metastatic tumors. 
Fig. 10.16(b) shows an FDG-PET scan of a patient with melanoma. Ma- 
lignant metastases are visible below the liver and beside the heart. A com- 
mon oncological use for SPECT is skeletal imaging with ⁹⁹ᵐTc bonded to 
phosphorus compounds. High uptake of these tracers is often indicative of 
secondary lesions from e.g. prostate or breast cancer. 

Figure 10.16: Examples of diagnostic procedures in molecular imaging. a) 
Differential SPECT scan using ⁹⁹ᵐTc-ECD to localize seizure epicenter. b) 
FDG-PET scan for a patient with melanoma. Several small lesions are visi- 
ble below liver and beside heart. Images courtesy of the University Hospital 
Erlangen, Clinic of Nuclear Medicine. 


In addition to oncology, bone SPECT is also used in the field of orthope- 
dics to localize and diagnose the source of pain felt by patients with faulty 
prosthetics, small fractures, or degenerative disease. PET and SPECT also 
both offer myocardial perfusion tracers, which allow cardiologists to assess 
the viability of the heart muscle and diagnose various heart diseases. 

Using quantitative imaging, physicians are also able to monitor disease 
over time. By comparing metrics such as SUV at scans taken at different time 
points, they can track the progression of e.g. metastatic lesions and better 
assess response to therapy. An example of this is shown in Fig. 10.17, where 
a breast cancer patient was imaged with ⁹⁹ᵐTc-labelled 3,3-diphosphono-1,2- 
propanodicarboxylic acid (DPD) at three different time points, roughly six 
months apart. In the first scan (cf. Fig. 10.17(a)), a region was seen in the 
skull with uptake suspicious of a metastatic bone tumor. In a follow-up study, 
the $\mathrm{SUV}_{\max}$ was seen to increase, and treatment with bisphosphonates was 
begun. In the final scan shown in Fig. 10.17(c), $\mathrm{SUV}_{\max}$ decreased, indicating 
a response to therapy. 

(a) March, 2012; $\mathrm{SUV}_{\max}$ = 4.5 

(b) September, 2012; $\mathrm{SUV}_{\max}$ = 6.0 

(c) February, 2013; $\mathrm{SUV}_{\max}$ = 5.1 

Figure 10.17: Same breast cancer patient imaged on three different dates 
with ⁹⁹ᵐTc-DPD. The calculated $\mathrm{SUV}_{\max}$ from the volume of interest at the 
posterior right area of the skull is also shown. A decrease in uptake was noted 
between the second and third scans, indicating a response to therapy. Images 
courtesy of the University Hospital Erlangen, Clinic of Nuclear Medicine. 


10.5.2 Therapy 


In addition to purely diagnostic imaging, emission tomography plays an in- 
tegral role in radioisotope therapy as well. Such procedures utilize the tracer 
principle to target malignant tissue with radiation. This radiation then elim- 
inates or stems the growth of unwanted cells. However, these positive effects 
must be weighed against the negative side effects on healthy tissue. To ac- 
complish this, physicians must estimate the dose of a course of therapy on 
sensitive organs. 

This process, known as dosimetry, is quite complex. It relies on quantify- 
ing the activity distribution within the patient, determining how long it will 
remain there, and estimating how much energy will be deposited in healthy 
tissue. As therapy agents typically involve higher energy emissions and more 
complicated spectra, the system matrix in such cases becomes more diffi- 
cult to define. Furthermore, post-processing steps such as organ segmentation 
and biological modeling become necessary. 


10.6 Hybrid Imaging 


10.6.1 Clinical Need 


SPECT and PET offer excellent sensitivity for the detection of disease due 
to the functional information they provide. However, pathological regions of 
an image may be difficult to localize in the body in the absence of structural 
information. 

Take, for example, a hypothetical surgeon who is planning a biopsy and 
needs to find the specific Sentinel Lymph Nodes (SLNs) draining a tumor in a 
breast cancer patient. Prior to surgery, a SPECT scan has clearly shown the 
presence of an SLN with high uptake in the underarm area, but it is known 
that there are multiple possible lymph nodes here. This stand-alone SPECT 
might appear similar to the left pane of Fig. 10.18, where only a single bright 
spot is visible. How will the surgeon proceed? 

Historically, during planar acquisitions, a technologist might trace the out- 
side edge of a patient's body with a radioactive “pen” to provide a rough 
anatomical point of reference in the image. The advantages of this method 
are limited, and it is, in any case, not possible for SPECT, where attempts 
may be made to register a previously acquired CT to the current SLN SPECT 
study. However, the human anatomy is non-rigid, and shifts in posture and 
time between scans may lead to errors. Our surgeon would therefore be left 
with the option to operate in the general area of the suspected SLN and rely 
on tedious scanning with gamma counting probes to find the exact node. 

Figure 10.18: SPECT data after labeling of a Sentinel Lymph Node (SLN) 
with ⁹⁹ᵐTc-Nanocoll. The corresponding structural information from a com- 
plementary X-ray CT scan helps provide proper localization of the activ- 
ity. Images courtesy of the University Hospital Erlangen, Clinic of Nuclear 
Medicine. 


10.6.2 Advent and Acceptance of Hybrid Scanners 


In 2000, David Townsend and Ronald Nutt of the University of Geneva, 
working together with CTI (now a division of Siemens Molecular Imaging), 
introduced the first hybrid PET/CT scanner. This device offered a PET ring 
detector and multi-slice spiral CT scanner integrated into the same gantry. 
Patients could therefore receive PET and CT studies in quick succession with- 
out moving, greatly reducing registration errors and providing both structural 
and functional information in one fell swoop. Six years later, Siemens Molec- 
ular Imaging introduced the Symbia SPECT/CT scanner, bringing the same 
advantages to the field of SPECT. Other manufacturers quickly developed 
similar hybrid imaging systems as well. 

With the advent of hybrid imaging, our hypothetical surgeon can now use 
the CT acquired with the SPECT study to pinpoint the location of the SLN 
prior to surgery, reducing both the time needed to perform the operation and 
the risk of misidentification. This is illustrated in the center and right panes 
of Fig. 10.18. In the center, a CT acquired immediately after the SPECT to 
the left is shown, and in the right, the two fused datasets are displayed. As the 
patient was lying on the same SPECT/CT gantry in the same position during 
both acquisitions, the surgeon can be sure of the accuracy in the registration 
between the two datasets. 

The integration of a PET or SPECT system with an X-ray CT scanner 
represents a complex engineering task. On the hardware side, care must be 
taken to ensure that the physical mating of the two systems does not affect 
their individual performance. On the firm- and software side, separate data 
transmission protocols, formats, and user interfaces must be unified as well. 
After the devices are physically complete, system engineers must work with 
others to develop new calibration and quality control routines. These might, 
for example, provide the reconstruction software with updated transforma- 
tions between the SPECT and CT coordinate systems, as these parameters 
vary over time due to parts wearing down or being replaced. 

In addition to the clinical benefits of hybrid imaging, the reconstruction 
process itself can be improved as well. In Sec. 10.4.3, we briefly discussed 
the importance of CT attenuation correction for emission tomography. The 
advent of hybrid devices has made these corrections the standard in most 
clinics rather than a research topic, reducing attenuation artifacts and paving 
the way for quantitative imaging. 

Hybrid PET/CT and SPECT/CT devices represented a major step forward in 
medical imaging. However, CT as a structural modality has its own weak- 
nesses. Namely, soft tissue contrast in regions of interest for molecular imag- 
ing, such as the liver and brain, is poor. Also, the presence of extra radiation 
dose from the CT is obviously undesirable. For this reason, MR imaging was 
proposed as a structural imaging modality for use in hybrid scanners. 

Although clinically exciting e.g. for neurology applications due to MR’s 
unparalleled contrast between different brain tissues and PET’s array of sen- 
sitive neurological tracers, the mating of PET and MR represented a host of 
new physical challenges. The most important of these was how to eliminate 
PMTs from the design, which are unusable in MR’s strong magnetic fields due 
to the interference they induce on a PMT’s moving electrons. Engineers were 
able to overcome this by substituting the standard scintillator-PMT setup 
with semiconductor detectors that convert photons directly to image data. 
However, another issue is how to derive attenuation maps from the MR data, 
which do not have the same direct physical meaning that CT’s Hounsfield 
units have and therefore must be processed further to obtain a μ-map. In this 
case, pattern recognition methods may be used to estimate the density map 
based on atlas data and segmentation/classification of the patient’s tissue. 

Having overcome these and other issues to a large extent, beginning in 
2011, each of the major manufacturers has released a commercial PET/MR 
system. Much research is currently being performed to both improve their 
performance and define new clinical applications. 


10.6.3 Further Benefits of Hybrid Imaging 


In addition to incorporating the μ-map into the system matrix to improve 
the physical model of the projection process, CT or MR information may be 
integrated into the reconstruction in other ways as well. We could assume, 
for example, that a sharp boundary in an MR brain image should have a 
correspondingly sharp boundary in the nuclear image because we expect that 
the radioactivity concentration across two types of tissue (e.g. white and 
gray matter) will be discontinuous. As the low resolution SPECT or PET 
reconstructions are not capable of reproducing this high resolution on their 
own, we could work this prior knowledge from MR into the objective function 
of an iterative reconstruction algorithm. 

The family of maximum a posteriori (MAP) algorithms is capable of doing 
exactly this by building upon the maximum likelihood method with a term 
representing some prior information known about the object. As the name 
implies, the MAP method seeks to maximize the posterior probability of the 
distribution being imaged given the observed data. We explain this principle 
in Geek Box 10.2. 

Geek Box 10.2: Maximum a posteriori Estimation 


In order to understand the idea of MAP estimation, we have to un- 
derstand that our model, i.e., the Poisson distribution, introduces 
conditions on our probability. Thus $P(D = d)$ is actually conditioned 
on the Poisson means $\nu$. As such, we actually need to denote it as 
$P(D = d \mid \nu)$ or $P(d \mid \nu)$ in short. Next, we realize that we are actually 
interested in $P(\nu \mid d)$, as $d$ is observable in our case and we seek to 
maximize the probability of $\nu$ given $d$. Fortunately, Bayes’ rule applies 
for conditional probabilities: 


$$\hat{\nu}^{*} = \operatorname*{argmax}_{\nu} P(\nu \mid d) = \operatorname*{argmax}_{\nu} \frac{P(d \mid \nu)\,P(\nu)}{P(d)}, \qquad (10.13)$$


where $P(d \mid \nu)$ is known from physics, $P(d)$ is independent of the op- 
timization and can therefore be neglected, and $P(\nu)$ is the prior term. 
Now $P(\nu)$ is independent of the actual observation $d$ and can there- 
fore be used to model any prior knowledge on the distribution of $\nu$. For 
hybrid applications, $P(\nu)$ is chosen based on the CT or MR informa- 
tion. Of course, MAP methods may also be used with purely PET or 
SPECT data to enforce e.g. smoothness, but their greatest potential 
benefit is the incorporation of information from other modalities. 
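Taking the logarithm of the posterior shows how the prior enters the objective function as an additive penalty. The Gibbs form of the prior used below, with energy function $U$ and weight $\beta$, is a common choice introduced here as an assumption; it is not prescribed by the text.

```latex
\hat{\nu}^{*}
  = \operatorname*{argmax}_{\nu} \bigl[ \ln P(d \mid \nu) + \ln P(\nu) \bigr],
\qquad
P(\nu) \propto e^{-\beta U(\nu)}
\;\Rightarrow\;
\hat{\nu}^{*}
  = \operatorname*{argmax}_{\nu} \bigl[ \ln P(d \mid \nu) - \beta\, U(\nu) \bigr].
```

For $\beta = 0$ the maximum likelihood estimate is recovered. An anatomically informed $U$ could, for example, penalize differences between neighboring voxels only when they belong to the same tissue class in the CT or MR image.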

Another example of this higher level of integration, although one not re- 
lying on the MAP principle, is the xSPECT Bone algorithm from Siemens, 
which is currently used for reconstructing SPECT skeletal scans. This method 
works by segmenting the CT into several different tissue classes and forward 
projecting them separately at each iteration. In addition to voxel-wise image 
updates, the classes themselves are also allowed to be scaled independently 
while optimizing the objective function. This scaling allows the SPECT re- 
construction to have very sharp edges at the boundaries between tissue classes 
(e.g. if a cortical bone class has much more uptake than a neighboring lung 
region), while maintaining a typical SPECT-like resolution within each class. 

An example of this method is shown in Fig. 10.19 below. Note how the 
edges of the vertebrae in the xSPECT image (center) are much sharper than 
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Figure 10.19: MLEM (left) and xSPECT Bone (center) reconstructions. 
The latter achieves sharper resolution at the edge of tissue classes with the 
help of extra-modal CT data (right). Images courtesy of the University Hos- 
pital Erlangen, Clinic of Nuclear Medicine. 


in the standard MLEM SPECT reconstruction (left) due to the boundary information
provided by the CT (right). However, the bladder appears very
similar in the two SPECT reconstructions, as there is little additional CT 
boundary information here. 

Despite the advantages of methods such as MAP and xSPECT Bone, there 
are risks as well. For example, a MAP method may assume that bone den- 
sity is always positively correlated to tracer uptake and enforce this behavior 
to improve quantitative accuracy. This is indeed generally the case, but in 
the early stages of a bone infarction, there is little or no blood supply to 
the bone and, hence, no tracer uptake, despite a normal CT. Our example 
MAP algorithm would then try to allocate activity here during the recon- 
struction and potentially provide the physician with a false negative. One 
should therefore be very careful when designing priors, as they are generally 
based on assumptions about anatomy or biochemistry that may not be true 
in all cases. 
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11.1 Introduction 


Acoustic waves with frequencies f between 16 Hz and 20 kHz can be sensed
by human hearing and are thus called audible waves or audible sound. If
f > 20 kHz, one speaks of ultrasound (Tab. 11.1). Some animal species such
as bats can perceive ultrasound and use it for echolocation: by measuring 
the time between sending and receiving (after partial reflection on a sur- 
face) ultrasonic waves, the distance of an object (e. g., a wall or prey) to the 
sender (bat) can be computed accurately, assuming that the sound velocity 
is known. In the previous century, modern technology started to make use of 
this technique with applications ranging from marine distance measurement 
(1920: SONAR) to medicine (1958: first ultrasound device in clinical use). A 
typical system is shown in Fig. 11.1. 
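The echolocation principle described above boils down to a single line of arithmetic. The following minimal sketch assumes a known, constant sound velocity; the function name and the example echo time are hypothetical.

```python
def echo_distance(round_trip_time_s, sound_velocity_m_per_s):
    """Distance to a reflector computed from the pulse-echo round-trip time."""
    # The pulse travels to the reflector and back, hence the factor 1/2.
    return 0.5 * sound_velocity_m_per_s * round_trip_time_s

# Hypothetical example: a bat in air (v = 331 m/s) receives its echo after 12 ms.
print(echo_distance(12e-3, 331.0), "m")
```

The same relation underlies the depth axis of every pulse-echo ultrasound image.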

Today, medical ultrasound is often the first-resort clinical imaging modality
due to its cost-effectiveness and lack of ionizing radiation. Typical medical
ultrasound frequencies lie in the range 2 MHz < f < 40 MHz. Traditionally, medical
ultrasound is mainly put to use in diagnostic applications; however, more
therapeutic applications are emerging.


© The Author(s) 2018 
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Figure 11.1: Clinical Ultrasound System in action. Image courtesy of 
Siemens Healthineers AG. 


Range |f |Examples
Infrasound |0 ... 16 Hz |Seismic waves
Audible sound |16 Hz ... 20 kHz |Music, human speech
Ultrasound |> 20 kHz |Bat, dolphin, and whale sounds; acoustic microscopy; ultrasound imaging


Table 11.1: Acoustic spectrum. 


11.2 Physics of Sound Waves 


This section introduces the basic underlying physics of ultrasound imaging. 


11.2.1 Sound Waves 


Acoustic signals emerge from organized movement of molecules or atoms, 
which cause local periodic compression of matter (gas, liquids, solid objects). 
Such spatially propagating, periodically repeating processes are commonly 
known as waves. Based on the direction of propagation, a distinction between 
transverse and longitudinal waves is made, where the nature of sound waves 
is the one of the latter class. 

Sound waves are mainly characterized by frequency, velocity, wavelength,
and intensity. Frequency f is measured in Hertz (Hz) and denotes the number
of oscillations per second. Sound velocity v within a medium, measured in meters
Medium |v [m s⁻¹] |Z [g cm⁻² s⁻¹]
Air |331 |4.3 · 10¹
Fat |1470 |1.42 · 10⁵
Water |1492 |1.48 · 10⁵
Brain tissue |1530 |1.56 · 10⁵
Muscles |1568 |1.63 · 10⁵
Bones |3600 |6.12 · 10⁵


Table 11.2: Sound velocity v and impedance Z of various media occurring 
in the human body. 


per second (m s⁻¹), is independent of f, but varies with material properties
such as elasticity and density. For some prominent examples see Tab. 11.2.
The wavelength λ is the distance between two oscillation maxima and is
measured in meters (m). Recall that the fundamental wave equation relates the
wavelength λ to the sound velocity v and the frequency f:


$$ \lambda = \frac{v}{f} \tag{11.1} $$

Finally, the intensity J of a sound wave is measured in watts per area
(W m⁻²) and denotes the acoustic power density. Typical values for J in
ultrasound diagnostics are between 1 and 10 mW cm⁻².
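To get a feeling for the orders of magnitude involved, Eq. (11.1) can be evaluated for the media of Tab. 11.2. The sketch below copies the velocities from that table; the chosen frequency of 5 MHz and all names are illustrative assumptions.

```python
# Sound velocities in m/s, copied from Tab. 11.2.
SOUND_VELOCITY = {
    "air": 331.0,
    "fat": 1470.0,
    "water": 1492.0,
    "brain tissue": 1530.0,
    "muscles": 1568.0,
    "bones": 3600.0,
}

def wavelength_mm(velocity_m_per_s, frequency_hz):
    """Wavelength in millimeters, i.e., velocity divided by frequency."""
    return velocity_m_per_s / frequency_hz * 1e3

frequency_hz = 5e6  # hypothetical transducer frequency of 5 MHz
for medium, velocity in SOUND_VELOCITY.items():
    print(f"{medium:12s} {wavelength_mm(velocity, frequency_hz):.3f} mm")
```

At typical diagnostic frequencies, the wavelength in soft tissue is thus a fraction of a millimeter.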


11.2.2 Sound Wave Characteristics at Boundaries 


The human body contains various kinds of boundaries between different materials,
for instance, at borders between organs and liquids or other tissues.
At such boundaries between two media, sound waves are partially reflected 
and partially transmitted. 


11.2.2.1 Reflection 


The well known law of reflection states that the angle of incidence equals the 
angle of reflection. This also holds for reflection of sound waves (cf. Fig. 11.2(b)). 
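For perpendicular incidence (cf. Fig. 11.2(a)), the reflection coefficient R and
the transmission coefficient T read:


$$ R = \frac{J_r}{J_0} = \left( \frac{Z_2 - Z_1}{Z_2 + Z_1} \right)^{2} \tag{11.2} $$

$$ T = \frac{J_t}{J_0} = \frac{4 Z_1 Z_2}{(Z_1 + Z_2)^{2}} \tag{11.3} $$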
Figure 11.2: (a) Reflected J_r and transmitted J_t wave intensity at the border
between two different materials with impedance Z_1 and Z_2, respectively. (b)
Reflection of sound waves at smooth surfaces (α_1 = α_2).


Material 1 |Material 2 |Reflected


Brain |Skull bone |43.5%
Fat |Muscle |1%
Fat |Kidney |0.6%
Muscle |Blood |0.1%
Soft tissue |Water |0.25%
Soft tissue |Air |99.9%


Table 11.3: Reflectivity at boundaries between two materials. 


where J_r, J_t, and J_0 denote the wave intensity of the reflected, transmitted,
and incident sound, respectively. Z_1 and Z_2 denote the acoustic impedances of
the two media. The acoustic impedance Z, which is measured in g cm⁻² s⁻¹,
can be computed from the tensile modulus E (elasticity) and the density D
of the given medium:


$$ Z = \sqrt{E \cdot D} \tag{11.4} $$


For some prominent examples see Tab. 11.2. 

From Eq. (11.2), it is interesting to see that for two media with equal
impedance Z_1 = Z_2, no reflection occurs. With similar impedances Z_1 ≈ Z_2,
as often found inside the human body at boundaries between similar
types of tissue, the reflection coefficient R is rather small, while for
|Z_1 − Z_2| ≫ 0, e.g., at boundaries between air (low impedance) and soft tissue
(high impedance), almost the entire wave is reflected (total reflection). The
latter immediately leads to the conclusion that organs containing air, such as 
the lungs, cannot be examined via medical ultrasound. For more details see 
Tab. 11.3. 
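The behavior of the reflection and transmission coefficients at perpendicular incidence can be checked numerically. The sketch below uses the impedances listed in Tab. 11.2; the function names are illustrative assumptions, and the printed values need not coincide with Tab. 11.3, which may be based on different impedance data.

```python
# Acoustic impedances in g cm^-2 s^-1, copied from Tab. 11.2.
IMPEDANCE = {
    "air": 4.3e1,
    "fat": 1.42e5,
    "water": 1.48e5,
    "brain tissue": 1.56e5,
    "muscles": 1.63e5,
    "bones": 6.12e5,
}

def reflection_coefficient(z1, z2):
    """Fraction of the incident intensity that is reflected (perpendicular incidence)."""
    return ((z2 - z1) / (z2 + z1)) ** 2

def transmission_coefficient(z1, z2):
    """Fraction of the incident intensity that is transmitted (perpendicular incidence)."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2

for first, second in [("fat", "muscles"), ("brain tissue", "bones"), ("air", "muscles")]:
    r = reflection_coefficient(IMPEDANCE[first], IMPEDANCE[second])
    t = transmission_coefficient(IMPEDANCE[first], IMPEDANCE[second])
    print(f"{first} / {second}: R = {100 * r:.2f} %, T = {100 * t:.2f} %")
```

Note that R and T always add up to one, i.e., no intensity is lost at the boundary itself in this idealized model.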
Figure 11.3: Scattering of sound waves at (a) a rough boundary (diffuse
reflection) between two different media with impedance Z_1 and Z_2; and
scattering at (b) inhomogeneities (depicted as blue dots) in a medium.


11.2.2.2 Scattering 


Scattering means diffuse reflection of small portions of a wave in various di- 
rections. It should be noted that the law of reflection (see above) holds for 
each of those portions. Small inhomogeneities in the material cause scattering 
of sound waves, cf. Fig. 11.3(b). The same holds for boundaries with rough 
surfaces as shown in Fig. 11.3(a), where the width of the reflection cone increases
with decreasing wavelength λ and increasing roughness of the surface.
Scattering at rough surfaces is highly relevant in medical ultrasound, because 
in the case of perfectly smooth boundaries, waves are only reflected towards 
the sender if the direction of the wave is perpendicular to the surface (no 
diffusion), whereas for rough boundaries, the reflections in various directions 
enable imaging of tilted boundaries. 


11.2.2.3 Diffraction 

When sound waves pass barriers, obstacles, or openings on their path, they
get diffracted. Diffraction involves a change in direction of the sound wave.
Increasing wavelength λ yields an increased amount of diffraction (sharpness of
bending), and vice versa. If λ is smaller than the size of the barrier, obstacle,
or opening, the occurring diffraction becomes negligible.


11.2.2.4 Refraction 


Snell's law of refraction known from optics states 
f [MHz] |d_max [cm] |Typical applications
1.0 |50 |n/a
3.5 |15 |Fetus, liver, heart, kidney
5.0 |10 |Brain
7.5 |7 |Prostate
10 |5 |Pancreas (intraoperative)
20 |1.2 |Eye, skin
40 |0.6 |Intravascular


Table 11.4: Maximum penetration depth d_max for various frequencies f.


$$ \frac{v_1}{v_2} = \frac{\sin \alpha_1}{\sin \alpha_2} \tag{11.5} $$

where α_1 and α_2 denote the angles of incidence and refraction in the two media, also
applies to sound waves. However, since the sound velocities (v_1, v_2) in human soft
tissue differ only marginally (see Tab. 11.2), the small effects of refraction in
medical ultrasound are negligible and therefore not considered further in this
chapter.


11.2.3 Attenuation 


Attenuation is the reduction in sound wave intensity J that occurs when 
a wave penetrates a medium. It follows the well-known exponential law of 
attenuation: 


$$ J(x) = J_0 \exp(-\mu x) \tag{11.6} $$


where J_0 denotes the initial intensity. The attenuation coefficient μ denotes
the attenuation that occurs with each cm the sound wave travels inside a
medium. It depends on the material (tissue type) and the ultrasound frequency f
and is measured in decibel (dB) per cm. The attenuation coefficient mainly consists
of two additive components μ = μ_a + μ_s, namely absorption μ_a and scattering
μ_s (see above). Absorption μ_a causes tissue to heat.

From Eq. (11.6), it can be easily seen that the acoustic intensity J de- 
creases with increasing penetration depth x. For a high maximum penetra- 
tion depth, low frequencies are necessary as shown in Tab. 11.4. However, 
the resolution of the acquired images decreases with decreasing frequency 
(cf. Sec. 11.3.3). Thus, the deeper the tissue penetration, the lower the spa- 
tial resolution. 
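The exponential decay of Eq. (11.6) is easy to explore numerically. The following sketch assumes a hypothetical attenuation coefficient given directly in units of 1/cm, so that it can be inserted into the exponential without a unit conversion; the numbers are illustrative and not measured tissue values.

```python
import math

def intensity(depth_cm, initial_intensity, mu_per_cm):
    """Intensity after depth_cm of travel, following the exponential attenuation law."""
    return initial_intensity * math.exp(-mu_per_cm * depth_cm)

def depth_for_fraction(fraction, mu_per_cm):
    """Depth at which the intensity has dropped to the given fraction of its initial value."""
    return -math.log(fraction) / mu_per_cm

mu = 0.5  # hypothetical attenuation coefficient in 1/cm
for depth in (0.0, 2.0, 5.0, 10.0):
    print(f"depth {depth:4.1f} cm: J / J_0 = {intensity(depth, 1.0, mu):.4f}")
print(f"1 % of J_0 remains at {depth_for_fraction(0.01, mu):.2f} cm")
```

Since μ grows with frequency, the depth returned by `depth_for_fraction` shrinks for higher frequencies, which is the qualitative trend listed in Tab. 11.4.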
Figure 11.4: Piezoelectric effect.
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11.3.1 Transducers 


An ultrasound transducer functions as both a generator and a detector of
ultrasonic waves. It converts mechanical energy into electrical energy and 
vice versa. When the transducer is pressed against the skin, it directs high- 
frequency sound waves into the body. Since sound waves produced by the 
transducer can barely penetrate air (cf. Tab. 11.3), gel is applied to the 
skin to help to minimize the amount of air between the transducer and the 
skin. As the waves penetrate the body, sound echoes are generated from 
the body’s fluids and tissues due to (diffuse) reflection and scattering. The 
strength and character of these sound echoes are recorded by the transducer 
and, depending on the type of transducer, can be transformed into 1-D, 2-D,
or 3-D images, which can be rendered and displayed to the user.


11.3.2 Piezoelectric Effect 


In order to generate and detect ultrasonic waves, transducers rely on the 
so-called piezoelectric effect. It describes the conversion of electrical energy 
into mechanical energy and vice versa in piezoelectric materials. On the one 
hand, mechanical pressure (“piezo” derives from the Greek word for “to press”) is converted
to electric polarization, which generates electric voltage. The electric voltage 
can be measured using two electrodes, as shown in Fig. 11.4. On the other 
hand, electrical fields cause contraction or stretching of the piezoelectric ma- 
terial. This contraction and stretching can be used to generate ultrasound 
waves by applying a high frequency alternating voltage. 

Typical piezoelectric materials used in medical ultrasound transducers are 
barium titanate (BaTiO₃) and lead zirconate titanate (PZT).
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Figure 11.5: Axial and lateral resolution of ultrasound devices. Minimal 
distance between two structures (blue dots) in axial/lateral direction that 
allows for distinguishing between them in the ultrasound image. 
Figure 11.6: Axial resolution: illustration of dependence on wave frequency
f. Left and right show two different time steps, where d denotes the distance
between two structures (blue dots). The high frequency (top) wave allows 
for distinguishing between the two structures, as clearly separated echoes are 
sent back to the sender. Using the low frequency wave, the echoes cannot be 
separated (echoes are merged). 


11.3.3 Spatial Resolution 


In ultrasound imaging, a distinction is made between two kinds of
spatial resolution, namely axial and lateral resolution (cf. Fig. 11.5).


11.3.3.1 Axial Resolution 


Axial resolution concerns structures lying behind each other w.r.t. the direction
of the ultrasound waves. The better the axial resolution, the smaller the
distance between two structures can be such that they can still be distinguished by
the transducer. Axial resolution is highly dependent on the ultrasound wave
frequency f. The illustration in Fig. 11.6 explains that dependency based on
a simple example, where an ultrasonic pulse generated by the transducer consists
of a single wave only (shortest possible pulse). The distance d between
the structures needs to be d > λ/2 in order to be able to distinguish between
them.
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11.3.3.2 Lateral Resolution 


Lateral resolution concerns the distinguishability of structures located next
to each other at the same distance from the transducer (same penetration
depth). Lateral resolution is always inferior to axial resolution.


11.3.3.3 Frequency Trade-off 


As described above, axial and lateral transducer resolution depend on the
ultrasound frequency f, and thus on the wavelength λ. As a rule of thumb:


$$ \text{axial:} \quad \Delta z > \lambda / 2 \tag{11.7} $$
$$ \text{lateral:} \quad \Delta x \approx 3 \cdot \Delta z \tag{11.8} $$


where Δz and Δx denote the minimum distance between two structures in axial/lateral
direction such that the ultrasound echoes are distinguishable. Hence,
high f yields high resolution, whereas low f yields low resolution. However,
the frequency is also directly related to attenuation (cf. Sec. 11.2.3), where
high f yields high attenuation and vice versa. Thus, with high frequency f,
the penetration depth is low but the images will have high resolution. At low
f, deeper penetration is possible, but the resolution will be lower. Depending
on the application, a trade-off between the desired properties (deep penetration
versus high resolution) needs to be found, and the transducer frequency
has to be adjusted accordingly.
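The trade-off can be made tangible by listing the axial resolution limit of half a wavelength next to the penetration depths of Tab. 11.4. The sketch below assumes an average soft tissue sound velocity of 1540 m/s, which is a commonly used value but not one stated in this chapter; the names are illustrative.

```python
SOUND_VELOCITY = 1540.0  # m/s, assumed average for soft tissue

# (frequency in MHz, maximum penetration depth in cm), copied from Tab. 11.4.
PENETRATION_DEPTH = [
    (1.0, 50.0), (3.5, 15.0), (5.0, 10.0), (7.5, 7.0),
    (10.0, 5.0), (20.0, 1.2), (40.0, 0.6),
]

def axial_resolution_mm(frequency_mhz, velocity_m_per_s=SOUND_VELOCITY):
    """Half a wavelength in millimeters, the rule-of-thumb axial resolution limit."""
    wavelength_m = velocity_m_per_s / (frequency_mhz * 1e6)
    return 0.5 * wavelength_m * 1e3

for frequency_mhz, depth_cm in PENETRATION_DEPTH:
    limit = axial_resolution_mm(frequency_mhz)
    print(f"{frequency_mhz:5.1f} MHz: depth up to {depth_cm:5.1f} cm, axial limit {limit:.3f} mm")
```

Reading the printed rows from top to bottom shows both quantities shrinking together: finer detail is only available close to the transducer.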


11.3.4 Imaging Modes 


Ultrasound offers a large variety of different imaging modes. The most com- 
mon ones include A-mode, B-mode, and M-mode. A- and M-mode generate 
one-dimensional (1-D) images (signals), whereas B-mode can be used to ac- 
quire 2-D or even 3-D images (cf. Geek Box 11.1). Doppler (cf. Geek Box 11.2) 
can be acquired in 1-D and 2-D, and with the most recent generations of 
transducers also in 3-D. 


11.3.4.1 A-Mode (Amplitude Mode) 


A-mode is the simplest scanning mode. The amplitude of the
reflected ultrasound is displayed over the echo runtime along the beam direction.
Extractable measurements are: frequency, modulated frequency, height
of the impulse/amplitude, runtime, wave phase, phase shift, and attenuation.
However, the major disadvantage is that only very localized information (one
single line through the body) is acquired. 

The backscattered ultrasound intensity along a single ray is called A-mode.
From a continuously running high-frequency generator, a wave packet
is cut out with a “gate” and is passed to the transducer. The returning echo is
passed through a duplexer to a time-dependent amplifier (time gain compensation).
Echoes that arrive later, which are weaker because of absorption,
are amplified more strongly than the signals from the surface. Signals from great depth
(15 cm) are amplified by up to 120 dB. As a result, the signal height of a reflection at an
interface is independent of the penetration depth from which the echo comes. The
signal-to-noise ratio, however, becomes worse with increasing depth. The next
pulse is emitted only after all echoes of the previous pulse have
decayed. The pulse repetition rate therefore depends on the penetration depth and thus on
the frequency used.
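Both the time gain compensation and the limit on the repetition rate follow from the round-trip travel time of the echo. The sketch below is a simplified model under stated assumptions: a sound velocity of 1540 m/s and a hypothetical attenuation of 4 dB per cm of travelled path, chosen only so that an echo from 15 cm depth requires the 120 dB mentioned above. The function names are illustrative.

```python
SOUND_VELOCITY_CM_PER_S = 1.54e5  # assumed average for soft tissue (1540 m/s)

def tgc_gain_db(echo_time_s, attenuation_db_per_cm):
    """Gain in dB that compensates the attenuation along the full round-trip path."""
    path_cm = SOUND_VELOCITY_CM_PER_S * echo_time_s  # distance travelled there and back
    return attenuation_db_per_cm * path_cm

def max_pulse_repetition_rate_hz(max_depth_cm):
    """Highest repetition rate that still waits for the echo from max_depth_cm."""
    round_trip_time_s = 2.0 * max_depth_cm / SOUND_VELOCITY_CM_PER_S
    return 1.0 / round_trip_time_s

attenuation = 4.0  # hypothetical value in dB per cm of travelled path
for depth_cm in (1.0, 5.0, 15.0):
    echo_time = 2.0 * depth_cm / SOUND_VELOCITY_CM_PER_S
    print(f"depth {depth_cm:4.1f} cm: echo after {echo_time * 1e6:6.1f} us, "
          f"gain {tgc_gain_db(echo_time, attenuation):5.1f} dB")
print(f"max. repetition rate for 15 cm: {max_pulse_repetition_rate_hz(15.0):.0f} Hz")
```

A shallower imaging depth shortens the waiting time per pulse and therefore permits a higher repetition rate.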


11.3.4.2 B-Mode (Brightness Mode) 


B-mode is the most common ultrasound mode. B-mode images are generated 
by systematically combining a multitude of A-mode (1-D) scans into a single 
2-D image, where the intensity of a pixel is defined by the amplitude of the 
corresponding ultrasonic ray. In brief, in order to acquire 2-D images of the 
inner body, the ultrasound device has to sample not only on a 1-D ray (as in 
A-mode), but on a 2-D plane in 3-D space. Hence, various rays are sent in 
different directions. To achieve this, two techniques are commonly used: the 
mechanical and the electronic method. 


Mechanical scanners The transducer oscillates in front of the patient, without
any external movement of the probe head. Thus, a slice of the human
body is represented in the form of a circle segment. The intensity of the echo
is converted into gray values and inserted into an image matrix (B-mode).
An image consists of a fan of typically 100 lines.


Electronic scanners (linear/curved arrays) Here, many (60 to 100) and 
very small (0.5 mm to 1 mm) transducers are used, which are arranged in a 
row (“array”). A group of transducers is activated simultaneously. For scan- 
ning, the whole group of elements is shifted. With a curved arrangement of 
transducers, an image detail can be represented as a circle segment. 


Electronic scanners (phased arrays) Every transducer element of an
array can be addressed for both sending and receiving with an individually
adjustable delay.
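One way such individual delays can be used is to tilt the emitted wavefront electronically. The sketch below is a simplified plane-wave steering model that is not described in the text above: it assumes equally spaced elements, a hypothetical element pitch, and a sound velocity of 1540 m/s. Neighboring elements then fire with a constant time offset of pitch · sin(angle) / velocity.

```python
import math

def steering_delays(num_elements, pitch_m, angle_deg, velocity_m_per_s=1540.0):
    """Firing delays in seconds that tilt a plane wavefront by angle_deg."""
    step = pitch_m * math.sin(math.radians(angle_deg)) / velocity_m_per_s
    delays = [n * step for n in range(num_elements)]
    offset = min(delays)  # shift so that the earliest element fires at time zero
    return [d - offset for d in delays]

# Hypothetical array: 8 elements with 0.5 mm pitch, steered by 30 degrees.
for index, delay in enumerate(steering_delays(8, 0.5e-3, 30.0)):
    print(f"element {index}: {delay * 1e9:7.1f} ns")
```

Changing the sign of the angle reverses the firing order, so the beam can be swept across a sector without any mechanical movement.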
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Geek Box 11.1: 3-D Ultrasound 


In 3-D ultrasound imaging, several 2-D images (B-mode) at differ- 
ent angles (w.r.t. axial direction) are combined into one 3-D volume. 
Real-time processing and visualization (rendering) of 3-D ultrasound 
images (volumes) requires high computational power, where graphics 
processing units can be used. Arbitrary section planes and “virtual 
travels through the body” are possible. The first 3-D ultrasound sys- 
tem was reported by Kazunori Baba in 1984. Slowly but steadily, 3-D 
ultrasound is becoming the standard of care in various medical fields 
(e. g., echocardiography), where 2-D imaging was traditionally used. 
One common application is to show parents their child even
before birth. An example is found right below this text.


11.3.4.3 M-Mode (Motion Mode) 


In motion mode, ultrasonic pulses are emitted from the transducer in quick 
succession without movement of the transducer. Either an A-mode or a B- 
mode image is acquired each time. This allows for time-dependent measure- 
ment of organ movement relative to the probe. Thus, the velocity of specific 
organ structures can be obtained. This can be useful, for instance, when the 
movement of the cardiac wall (myocardium) is to be analyzed (echocardiography).


11.4 Safety Aspects 


Ultrasound imaging offers many benefits over other imaging techniques, in- 
cluding: 


• Non-invasiveness (no injections or needles in most cases) and mostly painless.
Geek Box 11.2: Doppler Ultrasound


Medical Doppler ultrasonography enables the measuring and visu- 
alization of blood flow (blood velocities). Two modes are frequently 
used: continuous wave (CW) Doppler and pulsed wave (PW) Doppler. 
In CW Doppler, half of the transducer array emits, and the other half 
detects pulses. It has the advantage that it allows for continuous imag- 
ing due to simultaneous emission and detection. However, no distance 
information can be measured. In PW Doppler, which is pulse-based, 
distance information can be obtained using time-gating. However, no 
continuous imaging is possible. 

Doppler ultrasonography exploits the well-known Doppler effect. The
Doppler effect is named after its discoverer Christian Doppler
(1803-1853) and can be observed in various situations, for instance,
in the changing pitch of an ambulance siren when the ambulance passes at
high speed. Another example can be found in astronomy: the astronomical
redshift. Most relevant to medical Doppler ultrasonography, however,
is blood flow, i.e., Doppler ultrasonography can visualize blood
velocities. The Doppler effect describes the change in wave frequency
caused by a relative movement between source and observer. A characteristic
frequency shift appears, which is proportional to the relative velocity.
Doppler ultrasonography aims at measuring the shift in frequency to
estimate velocities (e. g., of blood in vessels).



In Doppler blood flow imaging, the sources are the moving blood cells,
at which the waves scatter. The observer is the ultrasound transducer.
The smaller the angle between the direction of blood flow in the vessel 
and the ultrasound wave direction, the better the Doppler effect can 
be exploited. 
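The proportionality between frequency shift and velocity, and the role of the angle, can be sketched with the standard pulse-echo Doppler relation, shift = 2 · f_0 · v · cos(angle) / c. This formula is not spelled out in the geek box and is used here as an assumption, together with a sound velocity of 1540 m/s and hypothetical example values.

```python
import math

def doppler_shift_hz(emitted_hz, blood_velocity_m_per_s, angle_deg, sound_velocity=1540.0):
    """Frequency shift of the echo from scatterers moving at the given velocity."""
    cos_angle = math.cos(math.radians(angle_deg))
    # Factor 2: the wave is shifted once on the way to the blood cell and once on the way back.
    return 2.0 * emitted_hz * blood_velocity_m_per_s * cos_angle / sound_velocity

def velocity_from_shift(shift_hz, emitted_hz, angle_deg, sound_velocity=1540.0):
    """Invert the relation above to estimate the blood velocity."""
    cos_angle = math.cos(math.radians(angle_deg))
    return shift_hz * sound_velocity / (2.0 * emitted_hz * cos_angle)

# Hypothetical example: 5 MHz transducer, blood moving at 0.5 m/s.
for angle in (0.0, 30.0, 60.0, 85.0):
    print(f"angle {angle:4.1f} deg: shift {doppler_shift_hz(5e6, 0.5, angle):7.1f} Hz")
```

The shift shrinks towards zero as the angle approaches 90 degrees, which is why a small angle between beam and vessel is preferable.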
• Image acquisition is fast and relatively easy to learn.

• No ionizing radiation (contrary to X-ray/CT).

• Large number of potential applications: ultrasound can visualize structure,
movement, and function of the body's organs and blood vessels.


However, ultrasound waves can harm the body: 


• through heating, proportional to absorbed acoustic intensity, or
• through cavitation, i.e., gas bubbles that emerge in the low-pressure
phases of sound waves and collapse in the high-pressure phases.


Since acoustic intensities for medical diagnostics are rather low, the
effects described above have proven to be harmless in diagnostic use. Medical
ultrasound is considered one of the least harmful imaging techniques available 
today and is even used during pregnancy. 

Therapeutic use of ultrasound can be found in gallstone and kidney stone
therapies, where high-intensity localized ultrasound is used to break up the
stones. The heating effect of ultrasound waves can further be used to destroy 
diseased or cancerous tissue. 
Optical coherence tomography (OCT) is an interferometry-based three-dimensional imaging modality that
can be used on scattering media, including several types of body tissues. It
provides physicians with in-situ image data in micrometer resolution within
seconds. OCT’s working principle is similar to that of ultrasound, but it uses light
instead of sound waves. It is also free of potentially harmful ionizing radiation
and is non-invasive.

OCT in ophthalmology (the branch of medicine concerned with the eyes)
was pioneered by David Huang, Eric Swanson, and James G. Fujimoto
and has become a standard modality that clinicians use
on a daily basis. Since then, OCT has been continuously developed further,
providing significant increases in imaging speed and resolution.


12.1 Working Principle of OCT 


OCT uses low-coherence interferometry to determine depth and reflectivity 
within scattering tissues. In order to understand this process, we recall some 
basic properties of light and waves from the previous chapters. 
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Figure 12.1: Patient being imaged with a commercial OCT device. Image 
courtesy of Carl Zeiss Meditec AG. 



Figure 12.2: OCT B-scan of the retina. Brighter pixels indicate tissue which 
reflects more light. The upper portion of the figure shows the vitreous humor 
which has very low reflectivity. The small pit in the center is the macula, the 
center of vision with the highest resolution. The lowest horizontal bright band 
corresponds to the retinal pigment epithelium and the cloud-like structure 
below depicts the choroid, a blood vessel network supplying the retina with 
nutrients and oxygen. 


• Light exhibits properties of particles and waves, of which only the latter
are relevant for this chapter. Light’s electromagnetic wave properties form
the basis for OCT.

• Coherence: two waves (or their sources) are described as being coherent
with each other when they have matching wavelengths and a constant
phase shift relative to each other.

• Interference: coherent waves superpose with each other (superposition
principle) and can cancel each other out (destructive interference) or
reinforce each other (constructive interference).

• Bandwidth describes the width of the spectrum that a light source emits.
In contrast, a monochromatic light source only emits light with
one wavelength. Such a light source has a bandwidth of 0.
Figure 12.3: Michelson interferometer. Half of the light from the light source
travels to each of the mirrors M_1 and M_2 before arriving at the detector. Differences
in path lengths lead to interference.


12.1.1 Michelson Interferometer 


To observe interference of light, interferometers are used. Fig. 12.3 shows
a Michelson interferometer. It splits light coming from a source into two
different paths, where the light can be treated differently, and merges the
light coming back from these two paths to create interference. Light is split
at the semi-transparent mirror in the center: half of it is reflected
towards mirror M_1, while the other half passes through the semi-transparent
mirror towards mirror M_2. Mirror M_1 reflects the light back towards the
semi-transparent mirror, where half of the light passes through to a detector.
Half of the light coming from mirror M_2 is reflected by the semi-transparent
mirror and also travels to the detector. Interference occurs along the distance
between the central semi-transparent mirror and the detector. The distance
that light travels is called path length and the two paths that the light takes
are called arms. If the distances between the semi-transparent mirror and the
mirrors M_1 and M_2 are equal, the path lengths are equal and constructive
interference will occur.

The detector does not directly detect the waves that form the electromagnetic
field, but it detects the intensity of the light, averaged over a small time
span, with the detected intensity I being proportional to the square of the electromagnetic
field E:

$$ I \propto E^{2} \tag{12.1} $$
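For a monochromatic source, the interplay of path length difference and detected intensity can be sketched with two superposed waves of equal amplitude. The model below is an idealization under stated assumptions: complex amplitudes represent the two beams, the proportionality constant of the detector is set to one, and the wavelength of 840 nm is a hypothetical example value.

```python
import cmath
import math

def detected_intensity(path_difference_m, wavelength_m, amplitude=1.0):
    """Intensity of two superposed coherent beams with the given path length difference."""
    beam_1 = amplitude
    # The second beam is delayed in phase according to the extra path it travelled.
    beam_2 = amplitude * cmath.exp(2j * math.pi * path_difference_m / wavelength_m)
    return abs(beam_1 + beam_2) ** 2

wavelength = 840e-9  # hypothetical source wavelength
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    value = detected_intensity(fraction * wavelength, wavelength)
    print(f"path difference {fraction:4.2f} wavelengths: intensity {value:.3f}")
```

Equal path lengths give the maximum (constructive interference), whereas a difference of half a wavelength cancels the signal (destructive interference).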


12.1.2 Coherence Length 


In practice, interference is limited by the coherence length. The coherence
length describes how large the difference in path lengths can be for interference
to occur. If the difference in path lengths is greater than the coherence length,
no interference can be observed. Coherence length is inversely proportional
Geek Box 12.1: Coherence Length
The coherence length l_c of a light source is calculated by
(12.2)
with λ_0 being the central wavelength of the light source and Δλ its
bandwidth. As can be seen, a higher bandwidth leads to a smaller
coherence length.
The upper half of the plot shows two reflectors with different reflectivities
at different distances (as Dirac impulses). If the reference mirror
is moved to match the path length of the reflectors, the measured intensity
becomes maximal. Shorter coherence lengths also increase the
resolution.


to the bandwidth (see Geek Box 12.1 for more details on coherence length). 
Now, if the Michelson interferometer uses a low-coherence light source (a 
light source which emits a spectrum), the coherence length can be used to 
determine the distance of a reflector in one of the interferometer’s arms by 
gradually moving the mirror in the other arm. Fig. 12.4 illustrates this:
moving mirror M_1 to match the distances z_{M_1} and z_{M_2} generates an
intensity peak at the detector. The plot in Geek Box 12.1 shows how the
intensity peaks when z_{M_1} and z_{M_2} are matched.
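The inverse relation between bandwidth and coherence length can be explored with a short calculation. The sketch below assumes a light source with a Gaussian spectrum, for which the coherence length is commonly quoted as (2 ln 2 / π) · λ_0² / Δλ. This specific prefactor is an assumption and may differ from the definition used in Geek Box 12.1; the example wavelengths are hypothetical.

```python
import math

def coherence_length_um(center_wavelength_nm, bandwidth_nm):
    """Coherence length in micrometers, assuming a Gaussian source spectrum."""
    prefactor = 2.0 * math.log(2.0) / math.pi  # assumption: Gaussian spectral shape
    length_nm = prefactor * center_wavelength_nm ** 2 / bandwidth_nm
    return length_nm * 1e-3

# Hypothetical source centered at 840 nm with increasing bandwidth.
for bandwidth in (10.0, 50.0, 100.0):
    print(f"bandwidth {bandwidth:5.1f} nm: l_c = {coherence_length_um(840.0, bandwidth):6.2f} um")
```

Independent of the exact prefactor, doubling the bandwidth halves the coherence length, which is why broadband sources are attractive for OCT.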


12.2 Time Domain OCT 


The principle of low-coherence interferometry is used by OCT to image scattering
samples. The Michelson interferometer is adapted by replacing one mirror
(M_2 in this case) with a sample (e. g., a patient's eye) to be imaged (cf.
Figure 12.4: Michelson interferometer with low-coherence light source to
measure the distance z_{M_2}. Mirror M_1 is moved to match the distances z_{M_1}
and z_{M_2}, which will generate an intensity peak in the detector.
Figure 12.5: Setup of a time-domain OCT system: one mirror has been
replaced with a sample. The other mirror can move to acquire an A-scan.
The mirror is located in the reference arm, the sample in the sample arm.


Fig. 12.5). The remaining mirror M_1 forms part of the reference arm, whereas
the sample becomes part of the sample arm. The sample has to be translucent 
enough to permit light to travel through it and to reflect back from different 
layers. Thus, movement of the mirror over time results in a depth profile of 
intensities of reflection at one position of the sample. This is called an A-scan. 
Directing the beam along a line across the sample, while acquiring A-scans 
at regular intervals, yields a two-dimensional image which is called a B-scan. 
Creating a raster scan of B-scans yields a volume. Every pixel column in 
Fig. 12.2 is an A-scan. The moving mirror is a disadvantage though, since it 
limits the maximum sampling speed of the OCT device. 
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Figure 12.6: The OCT beam raster-scans the surface of the retina. Moving 
the beam along a line results in a B-scan (2-D image). Every column in a B- 
scan image is an A-scan. After each B-scan, the beam travels to the beginning 
of the next one. 


12.3 Fourier Domain OCT 


Modern Fourier domain OCT systems work differently. The spectrum of the 
A-scan can be acquired simultaneously and the moving mirror in the reference 
arm becomes unnecessary. Since we acquire the spectrum of the A-scan, we 
can apply an inverse Fourier transform, which yields the respective A-scan.
This enables significantly higher acquisition speeds since the OCT device 
does not contain moving parts anymore. 
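The idea of recovering a depth profile from a measured spectrum can be demonstrated with a strongly simplified model. The sketch below assumes ideal point reflectors, a spectrum sampled evenly in wavenumber, no constant background term, and depths expressed directly as integer bins. Under these assumptions, each reflector contributes a cosine modulation whose number of cycles across the spectrum encodes its depth. All names are hypothetical, and the plain discrete Fourier transform is written out to keep the example self-contained.

```python
import cmath
import math

def spectral_interferogram(depth_bins, reflectivities, num_samples):
    """Simulated spectrum: a reflector in depth bin m adds a cosine with m cycles."""
    return [
        sum(r * math.cos(2.0 * math.pi * m * k / num_samples)
            for m, r in zip(depth_bins, reflectivities))
        for k in range(num_samples)
    ]

def a_scan(spectrum):
    """Magnitude of the inverse discrete Fourier transform (positive depths only)."""
    n = len(spectrum)
    profile = []
    for m in range(n // 2):
        coefficient = sum(
            s * cmath.exp(2j * math.pi * m * k / n) for k, s in enumerate(spectrum)
        ) / n
        profile.append(abs(coefficient))
    return profile

# Two hypothetical reflectors in depth bins 20 and 45 with different reflectivities.
spectrum = spectral_interferogram([20, 45], [1.0, 0.4], num_samples=128)
profile = a_scan(spectrum)
strongest = sorted(range(len(profile)), key=lambda m: profile[m], reverse=True)[:2]
print("strongest depth bins:", sorted(strongest))
```

The whole depth profile is obtained from one acquired spectrum, without scanning a reference mirror through the depth range.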

Fourier domain OCT can be grouped into two variants. The first one is
spectral-domain OCT, where a spectrometer acquires the spectrum. The
speed is limited by how fast the spectrometer can acquire the spectrum.
Currently, resolutions of 3 µm with a scanning speed of up to 312,500
A-scans per second can be achieved.

The second one is swept-source OCT, where the light source sweeps across
a spectrum and a detector samples the spectrum over time. The speed limit is
set by how fast the light source can sweep across the spectrum, but the speed
is generally higher than the speed of spectrometers used for spectral-domain
OCT. Resolutions of 5 µm while scanning 800,000 to 3,350,000 A-scans per
second are currently possible in research systems.
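
To get a feeling for these rates, the time needed for one volume can be estimated with simple arithmetic. The sketch below assumes a hypothetical raster of 500 x 500 A-scans and ignores the time the beam needs to travel back to the start of the next B-scan, so it only gives a lower bound.

```python
def volume_acquisition_time(a_scans_per_b_scan, b_scans, a_scan_rate_hz):
    """Lower bound on the acquisition time of one volume in seconds."""
    return a_scans_per_b_scan * b_scans / a_scan_rate_hz

# Hypothetical 500 x 500 raster at the A-scan rates quoted in the text.
t_spectral = volume_acquisition_time(500, 500, 312_500)
t_swept = volume_acquisition_time(500, 500, 3_350_000)

print(f"spectral-domain: {t_spectral:.3f} s, swept-source: {t_swept:.3f} s")
```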


12.4 OCT Angiography 


OCT devices operate in the infrared light regime with wavelengths in the 
micrometer regime. Blood cells have diameters that lie in a similar range, 
i. e., white blood cells have diameters of 10-12 µm, red blood cells of 6-8 µm,
Figure 12.7: 3-D OCT angiography results in a layered reconstruction of
the vessels for each retinal layer. Here we show a wide 12 mm x 12 mm field 
of view of the superficial and deep retina as well as the choroid (from top to 
bottom). Image data courtesy of New England Eye Center, USA. 


and platelets of 2-3 µm. This size is just about right to induce high speckle
noise in the image. In OCT angiography, this effect is exploited to create 
a visualization of vessels without the need for a contrast agent. The idea is to
scan the same area of the retina multiple times to generate a map of variance. 
This map will have a high response in areas that contain vessels. Using the 
structural OCT image (cf. Fig. 12.2), the retinal layers are then segmented 
and used to create projections of each layer. Fig. 12.7 shows such projections 
for the superficial and deep vascular plexi as well as the choroid. In Geek 
Box 12.2, we detail measures for OCT angiography reconstruction. Note that 
comparison of scans that were acquired in rapid sequence also allows for the 
estimation of blood flow speed. This topic is the subject of current OCT research.


12.5 Applications 


OCT is predominantly used for imaging the eye. However, its application is 
also quite common in other body regions. In the following, we briefly
summarize OCT's fields of application.


• Ophthalmic Imaging: Retinal imaging is currently the major application
for OCT. Both the retina and anterior eye can be imaged for diagnos- 
tic purposes completely non-invasively in 3-D. Furthermore, as described 
above, the vessel structure can also be investigated in 3-D without the use 
Geek Box 12.2: OCT Angiography Signal Generation


In order to quantify the variance in OCT images, several measures 
have been proposed. Speckle variance assumes a normal distribution 
to compute the signal variance 


    \sigma_{\mathrm{SV}}^2 = \frac{1}{N} \sum_{n=0}^{N-1} \left( I_n - \bar{I} \right)^2    (12.3)


where $I_n$ are the $N$ individual structural measurements and $\bar{I}$ their cor-
responding mean value.

In order to accommodate the acquisition sequence, the above concept can
be extended to only compare neighboring acquisitions. The resulting
method is called inter-frame variance


    \sigma_{\mathrm{IF}}^2 = \frac{1}{N-1} \sum_{n=1}^{N-1} \left( I_n - I_{n-1} \right)^2    (12.4)


Note that this measure again uses a normal distribution as underly- 
ing assumption. This time, however, we assume that the inter-frame 
differences are normally distributed and their mean is 0. 

Another extension to this is the so-called amplitude decorrelation in 
which we introduce additional scaling to the variance computation. 
This concept is very similar to inter-frame variance; however, a local
scaling of $\sqrt{I_{n-1}^2 + I_n^2}$ is introduced for every amplitude difference.
Doing so, amplitude decorrelation is always scaled between 0 and 1 
and therefore can be interpreted as an “inverse correlation” where 0 is 
obtained for correlated observations and 1 for independent measure- 
ments. 
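
The three measures of this Geek Box can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the normalization constants, the small constant `eps` that guards against division by zero, and the synthetic test data are assumptions made for this example.

```python
import numpy as np

def speckle_variance(frames):
    """Variance over the N repeated measurements; frames has shape (N, ...)."""
    return np.mean((frames - frames.mean(axis=0)) ** 2, axis=0)

def inter_frame_variance(frames):
    """Mean squared difference of neighboring frames (zero-mean assumption)."""
    diffs = np.diff(frames, axis=0)
    return np.mean(diffs ** 2, axis=0)

def amplitude_decorrelation(frames, eps=1e-12):
    """Squared neighbor differences, each scaled by the sum of squared amplitudes."""
    diffs = np.diff(frames, axis=0)
    scale = frames[:-1] ** 2 + frames[1:] ** 2
    return np.mean(diffs ** 2 / (scale + eps), axis=0)

# Synthetic data: position 0 mimics static tissue, position 1 mimics a vessel.
rng = np.random.default_rng(0)
n_repeats = 5
frames = np.stack(
    [np.ones(n_repeats), rng.uniform(0.1, 1.0, n_repeats)], axis=1
)  # shape (n_repeats, 2)

sv = speckle_variance(frames)
ifv = inter_frame_variance(frames)
dec = amplitude_decorrelation(frames)
print(sv, ifv, dec)
```

For non-negative amplitudes, each scaled term (a - b)^2 / (a^2 + b^2) equals 1 - 2ab / (a^2 + b^2), which lies between 0 and 1. The static position yields 0 for all three measures, while the fluctuating one gives a positive response.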


of contrast agent. As such, OCT has become the standard of care for the 
diagnosis of eye diseases. Fig. 12.8 shows a volume of the anterior eye and 
part of the retina. 

• Cardiovascular Imaging: OCT can be used to diagnose cardiovascular diseases.
In order to do so, optical fibers are embedded into a catheter that is
inserted minimally invasively into the vessel system. Doing so, the vessel 
wall can be imaged and areas of concern can be investigated. These are 
typically calcifications and plaques that are attached to the vessel wall. 
Figure 12.8: OCT volume showing the structure of the cornea, lens, and
iris of the anterior eye. The disc in the background is part of the retina which 
is visible through the lens. These volumes are used in the visualization and 
diagnosis of corneal pathologies and glaucoma. 


[Figure 12.9 labels: calcified plaque; OCT probe in catheter; shadow from OCT probe]


Figure 12.9: B-scan from a blood vessel. The small circle in the middle is 
the OCT probe within the dark lumen. The bright ring around the lumen is 
the vessel’s endothelium (inner surface). The gap on the right side is caused 
by constructional properties of the probe. A calcified plaque is visible in the 
top right quadrant of the endothelium. 


Fig. 12.9 shows a cross section of a blood vessel. A rotating mirror is 
mounted at the tip of the catheter and deflects the OCT beam into the 
tissue around the probe. OCT offers higher resolution when compared to 
intravascular ultrasound. 

• Gastrointestinal Imaging: OCT is also used in gastrointestinal imaging,
where it might have the potential to enable earlier detection and prevention 
of cancer. Current research investigates application in the esophagus and 
the colon. 

• Dermatology: OCT angiography is investigated to detect skin cancer, which
has increased blood flow due to rapid growth of cancerous cells. Again, the
combination of structural and functional imaging potentially can enable
new ways of treatment. This topic is the subject of current research.